
Achieve New Updated (September) Cloudera CCA-410 Examination Questions 41-50

September 24, 2015

Ensurepass

 

QUESTION 41

What metadata is stored on a DataNode when a block is written to it?

 

A.

None. Only the block itself is written.

B.

Checksums for the data in the block, as a separate file.

C.

Information on the file’s location in HDFS.

D.

Node location of each block belonging to the same namespace.

 

Answer: B

Reference: Protecting per-DataNode Metadata
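
For illustration, the checksum metadata lives in a separate .meta file alongside the block file on the DataNode’s local disk. A sketch of a directory listing, assuming a hypothetical data directory (the real path depends on your dfs.data.dir setting):

$ ls /data/1/dfs/dn/current/finalized/
blk_1073741825
blk_1073741825_1001.meta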

 

 

QUESTION 42

What happens if a Mapper on one node goes into an infinite loop while running a MapReduce job?

 

A.

After a period of time, the JobTracker will restart the TaskTracker on the node on which the map task is running

B.

The Mapper will run indefinitely; the TaskTracker must be restarted to kill it

C.

The job will immediately fail.

D.

After a period of time, the TaskTracker will kill the Map Task.

 

Answer: D

Explanation: * The TaskTracker nodes are monitored. If they do not submit heartbeat signals often enough, they are deemed to have failed and their work is scheduled on a different TaskTracker.

* A TaskTracker will notify the JobTracker when a task fails. The JobTracker decides what to do then: it may resubmit the job elsewhere, it may mark that specific record as something to avoid, and it may even blacklist the TaskTracker as unreliable.
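
The “period of time” is governed by the MRv1 task timeout: if a task neither reads input, writes output, nor reports status for that long, the TaskTracker marks it as failed and kills it. A minimal sketch of the property in mapred-site.xml (600000 ms, i.e. ten minutes, is the usual default):

<property>
  <name>mapred.task.timeout</name>
  <!-- milliseconds a task may run without reporting progress before it is killed -->
  <value>600000</value>
</property>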

 

 

QUESTION 43

 

Your cluster implements HDFS High Availability (HA). Your two NameNodes are named nn01 and nn02. What occurs when you execute the command:

 

hdfs haadmin -failover nn01 nn02

 

A.

nn02 becomes the standby NameNode and nn01 becomes the active NameNode

B.

nn01 is fenced, and nn01 becomes the active NameNode

C.

nn01 is fenced, and nn02 becomes the active NameNode

D.

nn01 becomes the standby NameNode and nn02 becomes the active NameNode

 

Answer: C

Explanation: failover – initiate a failover between two NameNodes

 

This subcommand causes a failover from the first provided NameNode to the second. If the first NameNode is in the Standby state, this command simply transitions the second to the Active state without error. If the first NameNode is in the Active state, an attempt will be made to gracefully transition it to the Standby state. If this fails, the fencing methods (as configured by dfs.ha.fencing.methods) will be attempted in order until one of the methods succeeds. Only after this process will the second NameNode be transitioned to the Active state. If no fencing method succeeds, the second NameNode will not be transitioned to the Active state, and an error will be returned.
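
After issuing the failover, you can confirm each NameNode’s role with the -getServiceState subcommand, for example:

$ hdfs haadmin -getServiceState nn01
standby
$ hdfs haadmin -getServiceState nn02
active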

 

Reference: HDFS High Availability Administration, HA Administration using the haadmin command

 

 

QUESTION 44

Under which scenario would it be most appropriate to consider using faster (e.g., 10 Gigabit) Ethernet as the network fabric for your Hadoop cluster?

 

A.

When the typical workload generates a large amount of intermediate data, on the order of the input data itself.

B.

When the typical workload consists of processor-intensive tasks.

C.

When the typical workload consumes a large amount of input data, relative to the entire capacity of HDFS.

D.

When the typical workload generates a large amount of output data, significantly larger than the amount of intermediate data.

 

Answer: A

 

 

Explanation: When we encounter applications that produce large amounts of intermediate data (on the order of the same amount as is read in), we recommend two ports on a single Ethernet card or two channel-bonded Ethernet cards to provide 2 Gbps per machine. Alternatively, for customers who have already moved to 10 Gigabit Ethernet or InfiniBand, these solutions can be used to address network-bound workloads. Be sure that your operating system and BIOS are compatible if you’re considering switching to 10 Gigabit Ethernet.

 

Reference: Cloudera’s Support Team Shares Some Basic Hardware Recommendations

 

 

QUESTION 45

For a MapReduce job, on a cluster running MapReduce v1 (MRv1), what is the relationship between tasks and task attempts?

 

A.

There are always at least as many task attempts as there are tasks.

B.

There are always at most as many task attempts as there are tasks.

C.

There are always exactly as many task attempts as there are tasks.

D.

The developer sets the number of task attempts on job submission.

 

Answer: A

Explanation: Every task requires at least one attempt to run, and a task may be attempted more than once, for example when an attempt fails and is retried, or when speculative execution launches a duplicate attempt. The number of task attempts is therefore always greater than or equal to the number of tasks.

 

 

QUESTION 46

You have a cluster running with the FIFO Scheduler enabled. You submit a large job A to the cluster, which you expect to run for one hour. Then, you submit job B to the cluster, which you expect to run for only a couple of minutes.

 

You submit both jobs with the same priority.

 

Which two statements best describe how the FIFO Scheduler arbitrates the cluster resources for a job and its tasks?

 

A.

Given Jobs A and B submitted in that order, all tasks from job A are guaranteed to finish before all tasks from job B.

B.

The order of execution of tasks within a job may vary.

C.

Tasks are scheduled in the order of their jobs’ submission.

D.

The FIFO Scheduler will give, on average, equal share of the cluster resources over the job lifecycle.

E.

Because there is more than a single job on the cluster, the FIFO Scheduler will enforce a limit on the percentage of resources allocated to a particular job at any given time.

F.

The FIFO Scheduler will pass an exception back to the client when job B is submitted, since all slots on the cluster are in use.

 

Answer: BC

Explanation: The FIFO Scheduler fills free task slots in order of job submission (honoring priority), so job B's tasks run only as slots free up from job A; within a single job, the order in which individual tasks execute may vary with slot availability and data locality.

 

 

QUESTION 47

Which command does Hadoop offer to discover missing or corrupt HDFS data?

 

A.

The map-only checksum utility.

B.

Fsck

C.

Du

D.

Dskchk

E.

Hadoop does not provide any tools to discover missing or corrupt data; there is no need because three replicas are kept for each data block.

 

Answer: B

Explanation: HDFS supports the fsck command to check for various inconsistencies. It is designed to report problems with various files, e.g. missing blocks for a file or under-replicated blocks. Unlike a traditional fsck utility for native filesystems, this command does not correct the errors it detects. Normally, the NameNode automatically corrects most recoverable failures. HDFS’s fsck is not a Hadoop shell command; it can be run as ‘bin/hadoop fsck’. Fsck can be run on the whole filesystem or on a subset of files.
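
For example, to walk the whole namespace and report on files, blocks, and block locations (all standard fsck options):

$ hadoop fsck / -files -blocks -locations

The report ends with a summary of missing, corrupt, and under-replicated blocks, and marks the filesystem HEALTHY or CORRUPT accordingly.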

 

Reference: Hadoop DFS User Guide

 

 

QUESTION 48

You are running two Hadoop clusters (cluster1 and cluster2) that run identical versions of Hadoop.

 

You want to copy the data inside /home/foo on cluster1 to the directory /home/bar/ on cluster2.

 

What is the correct distcp syntax to copy one directory tree from one cluster to the other cluster?

 

A.

$ distCp hdfs://cluster1/home/foo hdfs://cluster2/home/bar/

B.

$ distCp cluster1:/home/foo cluster2:/home/bar/

C.

$ Hadoop distCp cluster1:/home/foo cluster2:/home/bar/

D.

$ hadoop distCp hdfs://cluster1/home/foo hdfs://cluster2/home/bar/

 

Answer: D

Explanation: The most common invocation of DistCp is an inter-cluster copy:

 

bash$ hadoop distcp hdfs://nn1:8020/foo/bar \

hdfs://nn2:8020/bar/foo

 

This will expand the namespace under /foo/bar on nn1 into a temporary file, partition its contents among a set of map tasks, and start a copy on each TaskTracker from nn1 to nn2.

Note that DistCp expects absolute paths.

 

Note:

* DistCp (distributed copy) is a tool used for large inter/intra-cluster copying.
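
When a copy is re-run, distcp also accepts flags such as -update (copy only sources that are missing or differ at the target) and -overwrite; a sketch using the same hypothetical NameNodes as above:

bash$ hadoop distcp -update hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo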

 

 

QUESTION 49

You’ve configured your cluster with HDFS Federation. One NameNode manages the /data namespace and another NameNode manages the /reports namespace. How do you configure a client machine to access both the /data and the /reports directories on the cluster?

 

A.

Configure the client to mount the /data namespace. As long as a single namespace is mounted and the client participates in the cluster, HDFS grants access to all files in the cluster to that client.

B.

Configure the client to mount both namespaces by specifying the appropriate properties in core-site.xml.

C.

You cannot configure a client to access both directories in the current implementation of HDFS Federation.

D.

You don’t need to configure any parameters on the client machine. Access is controlled by the NameNodes managing the namespace.

 

Answer: B

Explanation:

Note: HDFS Federation improves the existing HDFS architecture through a clear separation of namespace and storage, enabling a generic block storage layer. It enables support for multiple namespaces in the cluster to improve scalability and isolation. Federation also opens up the architecture, expanding the applicability of an HDFS cluster to new implementations and use cases.
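
Concretely, mounting both namespaces is usually expressed as a ViewFS mount table in core-site.xml. A minimal sketch, assuming hypothetical NameNode hostnames nn-data and nn-reports and a mount-table name of clusterX:

<property>
  <name>fs.defaultFS</name>
  <value>viewfs://clusterX</value>
</property>
<property>
  <name>fs.viewfs.mounttable.clusterX.link./data</name>
  <value>hdfs://nn-data:8020/data</value>
</property>
<property>
  <name>fs.viewfs.mounttable.clusterX.link./reports</name>
  <value>hdfs://nn-reports:8020/reports</value>
</property>

With this in place, client paths under /data and /reports resolve transparently to the correct NameNode.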

 

Reference: Hortonworks, An Introduction to HDFS Federation

 

 

QUESTION 50

You set mapred.tasktracker.reduce.tasks.maximum to a value of 4 on a cluster running MapReduce v1 (MRv1). How many reducers will run for any given job?

 

A.

A maximum of 4 reducers, but the actual number of reducers that run for any given job is based on the volume of input data.

B.

A maximum of 4 reducers, but the actual number of reducers that run for any given job is based on the volume of intermediate data.

C.

Four reducers will run. Once set by the cluster administrator, this parameter cannot be overridden.

D.

The number of reducers for any given job is set by the developer.

 

Answer: B

Explanation: mapred.tasktracker.reduce.tasks.maximum: the maximum number of reduce tasks that will be run simultaneously by a TaskTracker.
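
A sketch of the setting in mapred-site.xml on a TaskTracker node; note that it caps per-node concurrency rather than fixing how many reducers a job gets:

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <!-- at most this many reduce tasks run at once on this TaskTracker -->
  <value>4</value>
</property>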

 

A reducer uses intermediate data as input.

 

Note:

* The mapper output (intermediate data) is stored on the local filesystem (NOT HDFS) of each individual mapper node.

* Reducers start copying intermediate key-value pairs from the mappers as soon as they are available. The progress calculation also takes into account the data transfer done by the reduce process, so reduce progress starts showing up as soon as any intermediate key-value pair for a mapper is available to be transferred to a reducer.

 

Free VCE & PDF File for Cloudera CCA-410 Real Exam

Instant Access to Free VCE Files: CompTIA | VMware | SAP …
Instant Access to Free PDF Files: CompTIA | VMware | SAP …
