CCA-410 Examination questions (September)

Achieve New Updated (September) Cloudera CCA-410 Examination Questions 31-40

September 24, 2015

Ensurepass

 

QUESTION 31

What are the permissions of a file in HDFS with the following: rw-rw-r-x?

 

A.

HDFS runs in user space which makes all users with access to the namespace able to read, write and modify all files

B.

The owner and group cannot delete the file, but others can

C.

The owner and group can modify the contents of the file; others can't

D.

The owner and group can read the file; others can't

E.

No one can modify the content of the file

 

Answer: C

Explanation: The first set of 3 permissions (rw-) relate to the username, the second set of 3 permissions (rw-) relate to the usergroup and the final set of 3 permissions (r-x) relate to anyone else who is not associated with the username or groupname.

 

Each of the three sets of permissions is defined in the following manner:

r = Read permission

w = Write permission

x = Execute permission
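The mapping from a permission string to capabilities can be sketched in a few lines of Python (an illustrative parser written for this explanation, not part of HDFS itself):

```python
def parse_permissions(perms):
    """Split a 9-character permission string (e.g. 'rw-rw-r-x')
    into owner, group, and other capability lists."""
    labels = {"r": "read", "w": "write", "x": "execute"}
    result = {}
    for who, triplet in zip(("owner", "group", "other"),
                            (perms[0:3], perms[3:6], perms[6:9])):
        result[who] = [labels[c] for c in triplet if c != "-"]
    return result

print(parse_permissions("rw-rw-r-x"))
# owner and group can read and write; others can read and execute
```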


QUESTION 32

Cluster Summary

 

45 files and directories, 12 blocks = 57 total. Heap Size is 15.31 MB / 193.38 MB (7%)

 

[Exhibit: NameNode web UI Cluster Summary screenshot]

 

Refer to the above screenshot.

 

You configure the Hadoop cluster with seven DataNodes and the NameNode’s web UI displays the details shown in the exhibit.

 

What does this tell you?

 

A.

The HDFS cluster is in safe mode.

B.

Your cluster has lost all HDFS data which had blocks stored on the dead DataNode.

C.

One physical host crashed.

D.

The DataNode JVM on one host is not active.

 

Answer: A

Explanation: The data from the dead node is being replicated. The cluster is in safemode.

 

Note:

* Safemode


During start-up, the NameNode loads the filesystem state from the fsimage and edits log files. It then waits for DataNodes to report their blocks so that it does not prematurely start replicating blocks even though enough replicas may already exist in the cluster. During this time the NameNode stays in safemode. Safemode is essentially a read-only mode for the HDFS cluster: it does not allow any modifications to the filesystem or its blocks. Normally the NameNode leaves safemode automatically once the DataNodes have reported their blocks. If required, HDFS can be placed in safemode explicitly using the ‘bin/hadoop dfsadmin -safemode’ command. The NameNode front page shows whether safemode is on or off. A more detailed description and configuration is maintained as JavaDoc for setSafeMode().

* Data Disk Failure, Heartbeats and Re-Replication Each DataNode sends a Heartbeat message to the NameNode periodically. A network partition can cause a subset of DataNodes to lose connectivity with the NameNode. The NameNode detects this condition by the absence of a Heartbeat message. The NameNode marks DataNodes without recent Heartbeats as dead and does not forward any new IO requests to them. Any data that was registered to a dead DataNode is not available to HDFS any more. DataNode death may cause the replication factor of some blocks to fall below their specified value. The NameNode constantly tracks which blocks need to be replicated and initiates replication whenever necessary. The necessity for re-replication may arise due to many reasons: a DataNode may become unavailable, a replica may become corrupted, a hard disk on a DataNode may fail, or the replication factor of a file may be increased.

 

* The NameNode periodically receives a Heartbeat and a Blockreport from each of the DataNodes in the cluster. Receipt of a Heartbeat implies that the DataNode is functioning properly. A Blockreport contains a list of all blocks on a DataNode. When the NameNode notices that it has not received a heartbeat message from a DataNode after a certain amount of time, the DataNode is marked as dead. Since blocks will then be under-replicated, the system begins replicating the blocks that were stored on the dead DataNode. The NameNode orchestrates the replication of data blocks from one DataNode to another. The replication data transfer happens directly between DataNodes; the data never passes through the NameNode.
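The dead-node detection described above can be sketched as a simple timeout check (illustrative Python; the 630-second value approximates Hadoop's default 10.5-minute dead-node interval, and the function name is made up for this sketch):

```python
HEARTBEAT_TIMEOUT = 630.0  # seconds; roughly Hadoop's default dead-node interval

def find_dead_datanodes(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT):
    """Return the datanodes whose last heartbeat is older than the timeout.
    `last_heartbeat` maps datanode name -> timestamp of its last heartbeat."""
    return [node for node, ts in last_heartbeat.items() if now - ts > timeout]

heartbeats = {"dn1": 1000.0, "dn2": 400.0, "dn3": 990.0}
print(find_dead_datanodes(heartbeats, now=1060.0))  # ['dn2']
```

Once a node appears in that dead list, the NameNode would schedule re-replication of its blocks on the surviving DataNodes.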

 

Incorrect answers:

B: The data is not lost, it is being replicated.

 

Reference: 24 Interview Questions & Answers for Hadoop MapReduce developers, How NameNode Handles data node failures?


QUESTION 33

In a cluster configured with HDFS High Availability (HA) but NOT HDFS Federation, each map task runs:

 

A.

In the same Java Virtual Machine as the DataNode.

B.

In the same Java Virtual Machine as the TaskTracker

C.

In its own Java Virtual Machine.

D.

In the same Java Virtual Machine as the JobTracker.

 

Answer: C

Explanation: A TaskTracker is a slave node daemon in the cluster that accepts tasks (Map, Reduce and Shuffle operations) from a JobTracker. Only one TaskTracker process runs on any Hadoop slave node, and it runs in its own JVM process. Every TaskTracker is configured with a set of slots, which indicate the number of tasks it can accept. The TaskTracker starts a separate JVM process to do the actual work (called a Task Instance); this ensures that a process failure does not take down the TaskTracker. The TaskTracker monitors these task instances, capturing the output and exit codes. When the task instances finish, successfully or not, the TaskTracker notifies the JobTracker. TaskTrackers also send out heartbeat messages to the JobTracker, usually every few minutes, to reassure the JobTracker that they are still alive. These messages also inform the JobTracker of the number of available slots, so the JobTracker can stay up to date with where in the cluster work can be delegated.
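The slot mechanism can be modelled as a toy sketch (hypothetical Python, not Hadoop's actual implementation):

```python
class TaskTracker:
    """Toy model of MRv1 slot accounting: a tracker accepts a task
    only while it has a free map slot."""
    def __init__(self, map_slots):
        self.map_slots = map_slots
        self.running = []

    def accept(self, task):
        """Accept the task if a slot is free; otherwise refuse,
        so the JobTracker must delegate it to another node."""
        if len(self.running) < self.map_slots:
            self.running.append(task)
            return True
        return False

tt = TaskTracker(map_slots=2)
print([tt.accept(t) for t in ("t1", "t2", "t3")])  # [True, True, False]
```

In real MRv1, each accepted task then gets its own child JVM, which is why answer C is correct.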

 

Note: Despite this very high level of reliability, HDFS has always had a well-known single point of failure which impacts HDFS’s availability: the system relies on a single Name Node to coordinate access to the file system data. In clusters which are used exclusively for ETL or batch-processing workflows, a brief HDFS outage may not have immediate business impact on an organization; however, in the past few years we have seen HDFS begin to be used for more interactive workloads or, in the case of HBase, used to directly serve customer requests in real time. In cases such as this, an HDFS outage will immediately impact the productivity of internal users, and perhaps result in downtime visible to external users. For these reasons, adding high availability (HA) to the HDFS Name Node became one of the top priorities for the HDFS community.

 

Reference: 24 Interview Questions & Answers for Hadoop MapReduce developers , What is a Task Tracker in Hadoop? How many instances of TaskTracker run on a Hadoop Cluster


QUESTION 34

Your cluster is running MapReduce v1 (MRv1), with default replication set to 3 and a cluster block size of 64MB. Which best describes the file read process when a client application connects to the cluster and requests a 50MB file?

 

A.

The client queries the NameNode for the locations of the block, and reads all three copies. The first copy to complete transfer to the client is the one the client reads as part of Hadoop’s execution framework.

B.

The client queries the NameNode for the locations of the block, and reads from the first location in the list it receives.

C.

The client queries the NameNode for the locations of the block, and reads from a random location in the list it receives to eliminate network I/O loads by balancing which nodes it retrieves data from at any given time.

D.

The client queries the NameNode and then retrieves the block from the nearest DataNode to the client and then passes that block back to the client.

 

Answer: B
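As a sanity check on the question's numbers, a 50MB file with a 64MB block size occupies a single block, since the block count is the ceiling of file size over block size (illustrative Python, not part of the exam material):

```python
import math

def block_count(file_size_mb, block_size_mb=64):
    """Number of HDFS blocks a file occupies (the last block may be partial)."""
    return max(1, math.ceil(file_size_mb / block_size_mb))

print(block_count(50))   # 1 -- the 50MB file in the question is a single block
print(block_count(130))  # 3
```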

 

 

QUESTION 35

On a cluster running MapReduce v1 (MRv1), the value of the mapred.tasktracker.map.tasks.maximum configuration parameter in the mapred-site.xml file should be set to:

 

A.

Half the number of the maximum number of Reduce tasks which can run simultaneously on an individual node.

B.

The maximum number of Map tasks that can run simultaneously on an individual node.

C.

The same value on each slave node.

D.

The maximum number of Map tasks which can run on the cluster as a whole.

E.

Half the number of the maximum number of Reduce tasks which can run on the cluster as a whole.

 

Answer: B

Explanation: mapred.tasktracker.map.tasks.maximum

Range: 1/2 * (cores/node) to 2 * (cores/node)

Description: Number of map tasks to deploy on each machine.
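For example, the parameter might be set like this in mapred-site.xml (the value 8 is illustrative; the right value depends on the node's cores and memory, per the range above):

```xml
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
  <description>Maximum number of map tasks run simultaneously by this TaskTracker.</description>
</property>
```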


QUESTION 36

Which three file actions can you execute on a file once it is written into HDFS?

 

A.

You can index the file

B.

You can update the file's contents

C.

You can rename the file

D.

You can delete the file

E.

You can move the file

 

Answer: CDE

Explanation: CD: When a file is deleted by a user or an application, it is not immediately removed from HDFS. Instead, HDFS first renames it to a file in the /trash directory. The file can be restored quickly as long as it remains in /trash. A file remains in /trash for a configurable amount of time. After the expiry of its life in /trash, the NameNode deletes the file from the HDFS namespace. The deletion of a file causes the blocks associated with the file to be freed. Note that there could be an appreciable time delay between the time a file is deleted by a user and the time of the corresponding increase in free space in HDFS.

 

DE: fsck

Runs an HDFS filesystem checking utility.

Its COMMAND_OPTIONs include:

* -move

Move corrupted files to /lost+found

* -delete

Delete corrupted files.

 

 

QUESTION 37

In HDFS, you view a file with rw-r--r-- set as its permissions. What does this tell you about the file?

 

A.

The file cannot be deleted by anyone but the owner

B.

The file cannot be deleted by anyone

C.

The file cannot be run as a MapReduce job


D.

The file’s contents can be modified by the owner, but no one else

E.

As a Filesystem in Userspace (FUSE), HDFS files are available to all users on a cluster regardless of their underlying POSIX permissions.

 

Answer: A

 

 

QUESTION 38

How does HDFS Federation help HDFS scale horizontally?

 

A.

HDFS Federation improves the resiliency of HDFS in the face of network issues by removing the NameNode as a single point of failure.

B.

HDFS Federation allows the Standby NameNode to automatically resume the services of an active NameNode.

C.

HDFS Federation provides cross-data center (non-local) support for HDFS, allowing a cluster administrator to split the Block Storage outside the local cluster.

D.

HDFS Federation reduces the load on any single NameNode by using multiple, independent NameNodes to manage individual parts of the filesystem namespace.

 

Answer: D

Explanation: HDFS Federation: In order to scale the name service horizontally, federation uses multiple independent NameNodes/namespaces. The NameNodes are federated; that is, they are independent and do not require coordination with each other. The DataNodes are used as common storage for blocks by all the NameNodes. Each DataNode registers with all the NameNodes in the cluster. DataNodes send periodic heartbeats and block reports and handle commands from the NameNodes.
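On the client side, routing a path to the NameNode that owns its part of the namespace (as with ViewFS mount tables) amounts to longest-prefix matching; a toy sketch (hypothetical Python, not the actual ViewFS code):

```python
def route_to_namenode(path, mounts):
    """Pick the namenode owning the longest mount-point prefix of `path`
    (a toy model of federated, client-side namespace partitioning)."""
    best = None
    for mount, namenode in mounts.items():
        if path.startswith(mount) and (best is None or len(mount) > len(best[0])):
            best = (mount, namenode)
    return best[1] if best else None

mounts = {"/user": "nn1", "/data": "nn2", "/data/logs": "nn3"}
print(route_to_namenode("/data/logs/2015", mounts))  # nn3
print(route_to_namenode("/user/alice", mounts))      # nn1
```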

 

Reference: Apache Hadoop 2.0.2-alpha

 

http://hadoop.apache.org/docs/current/

 

 

QUESTION 39

Which two features does Kerberos security add to a Hadoop cluster?

 

A.

Authentication for user access to the cluster against a central server


B.

Encryption for data on disk (“at rest”)

C.

Encryption on all remote procedure calls (RPCs)

D.

User authentication on all remote procedure calls (RPCs)

E.

Root access to the cluster for the hdfs and mapred users, but non-root access for clients

 

Answer: AD

Explanation: Hadoop can use the Kerberos protocol to ensure that when someone makes a request, they really are who they say they are. This mechanism is used throughout the cluster. In a secure Hadoop configuration, all of the Hadoop daemons use Kerberos to perform mutual authentication, which means that when two daemons talk to each other, they each make sure that the other daemon is who it says it is. Additionally, this allows the NameNode and JobTracker to ensure that any HDFS or MapReduce requests are executed with the appropriate authorization level.

 

Reference: Documentation CDH3 Documentation CDH3 Security Guide, Introduction to Hadoop Security

 

 

QUESTION 40

Choose the option that best describes a Hadoop cluster's block size storage parameters once you set the HDFS default block size to 64MB.

 

A.

The block size of files in the cluster can be determined as the block is written.

B.

The block size of files in the Cluster will all be multiples of 64MB.

C.

The block size of files in the cluster will all be at least 64MB.

D.

The block size of files in the cluster will all be exactly 64MB.

 

Answer: D

Explanation:

Note: What is HDFS Block size? How is it different from traditional file system block size?

 

In HDFS, data is split into blocks and distributed across multiple nodes in the cluster. Each block is typically 64MB or 128MB in size, and each block is replicated multiple times (the default is to replicate each block three times). Replicas are stored on different nodes. HDFS uses the local file system to store each HDFS block as a separate file. The HDFS block size cannot be compared with the traditional file system block size.
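How a file is carved into fixed-size blocks can be sketched as follows (illustrative Python, written for this explanation):

```python
def split_into_blocks(file_size_mb, block_size_mb=64):
    """Sizes of the HDFS blocks a file of the given size would occupy."""
    blocks = []
    remaining = file_size_mb
    while remaining > 0:
        blocks.append(min(block_size_mb, remaining))
        remaining -= block_size_mb
    return blocks

print(split_into_blocks(192))  # [64, 64, 64]
```

Each of these blocks would then be replicated (three times by default) across different DataNodes.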

 

 

