
Cloudera CCA-410 Examination Questions 1-10 (Updated September)

September 24, 2015

Ensurepass

 


Exam A

 

QUESTION 1

Where does a MapReduce job store the intermediate data output from Mappers?

 

A. On the underlying filesystem of the local disk of the machine on which the JobTracker ran.

B. In HDFS, in the job’s output directory.

C. In HDFS, in a temporary directory defined by mapred.tmp.dir.

D. On the underlying filesystem of the local disk of the machine on which the Mapper ran.

E. On the underlying filesystem of the local disk of the machine on which the Reducer ran.

 

Answer: D

Explanation: The mapper output (intermediate data) is stored on the local filesystem (NOT HDFS) of each individual mapper node. This is typically a temporary directory location which can be set up in the configuration by the Hadoop administrator. The intermediate data is cleaned up after the Hadoop job completes.

 

Reference: 24 Interview Questions & Answers for Hadoop MapReduce Developers, "Where is the Mapper output (intermediate key-value data) stored?"
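
For illustration only, here is a minimal Java sketch (assuming a Hadoop 1.x/MR1 client with mapred-site.xml on the classpath) that reads mapred.local.dir, the property listing the local-disk directories where a TaskTracker writes intermediate map output; the fallback path shown is an assumption, not a recommended default.

import org.apache.hadoop.mapred.JobConf;

// Minimal sketch: inspect where MR1 writes intermediate map output.
// Assumes a Hadoop 1.x client with mapred-site.xml on the classpath;
// the fallback path below is purely illustrative.
public class IntermediateOutputDirs {
    public static void main(String[] args) {
        JobConf conf = new JobConf();
        // Comma-separated list of local-disk directories on each TaskTracker
        // where map-side spill files and merged intermediate output are written.
        String localDirs = conf.get("mapred.local.dir", "/tmp/hadoop/mapred/local");
        System.out.println("Intermediate map output directories: " + localDirs);
    }
}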

 

 

QUESTION 2

Which scheduler would you deploy to ensure that your cluster allows short jobs to finish within a reasonable time without starving long-running jobs?

 

A. FIFO Scheduler

B. Fair Scheduler

C. Capacity Scheduler

D. Completely Fair Scheduler (CFS)

 

Answer: B

Explanation: Fair scheduling is a method of assigning resources to jobs such that all jobs get, on average, an equal share of resources over time. When there is a single job running, that job uses the entire cluster. When other jobs are submitted, task slots that free up are assigned to the new jobs, so that each job gets roughly the same amount of CPU time. Unlike the default Hadoop scheduler, which forms a queue of jobs, this lets short jobs finish in reasonable time while not starving long jobs. It is also a reasonable way to share a cluster between a number of users. Finally, fair sharing can also work with job priorities – the priorities are used as weights to determine the fraction of total compute time that each job should get.

 

Reference: Hadoop, Fair Scheduler Guide
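
As a minimal sketch (assuming Hadoop 1.x/MR1, where the JobTracker's task scheduler is pluggable), these are the properties that swap the default FIFO queue for the Fair Scheduler; in practice they are set in mapred-site.xml on the JobTracker node, and the allocation-file path below is only an example.

import org.apache.hadoop.conf.Configuration;

// Sketch: MR1 settings that replace the default FIFO scheduler with the
// Fair Scheduler. Shown through the Configuration API for illustration;
// normally these live in mapred-site.xml on the JobTracker node.
public class FairSchedulerConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Plug the Fair Scheduler into the JobTracker.
        conf.set("mapred.jobtracker.taskScheduler",
                 "org.apache.hadoop.mapred.FairScheduler");
        // Pools, weights, and minimum shares are defined in an allocation file
        // (example path, adjust for your cluster).
        conf.set("mapred.fairscheduler.allocation.file",
                 "/etc/hadoop/conf/fair-scheduler.xml");
        System.out.println(conf.get("mapred.jobtracker.taskScheduler"));
    }
}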

 

 

QUESTION 3

Your existing Hadoop cluster has 30 slave nodes, each of which has 4 x 2TB hard drives. You plan to add another 10 nodes. How much disk space can your new nodes contain?

 

A. The new nodes must all contain 8TB of disk space, but it does not matter how the disks are configured

B. The new nodes cannot contain more than 8TB of disk space

C. The new nodes can contain any amount of disk space

D. The new nodes must all contain 4 x 2TB hard drives

 

Answer: C

 

 

QUESTION 4

What is the rule governing the formatting of the underlying filesystems in a Hadoop cluster?

 

A. They must all use the same filesystem, but it does not need to be the same filesystem as the one used by the NameNode

B. They must all be left as unformatted raw disks; Hadoop formats them automatically

C. They must all use the same filesystem as the NameNode

D. They must all be left as unformatted raw disks; Hadoop uses raw, unformatted disks for HDFS

E. They can each use a different filesystem

 

Answer: C

 

 

QUESTION 5

What action occurs automatically on a cluster when a DataNode is marked as dead?

 

A. The NameNode forces re-replication of all the blocks which were stored on the dead DataNode.

B. The next time a client submits a job that requires blocks from the dead DataNode, the JobTracker receives no heartbeats from the DataNode. The JobTracker tells the NameNode that the DataNode is dead, which triggers block re-replication on the cluster.

C. The replication factor of the files which had blocks stored on the dead DataNode is temporarily reduced, until the dead DataNode is recovered and returned to the cluster.

D. The NameNode informs the client that wrote the blocks that they are no longer available; the client then re-writes the blocks to a different DataNode.

 

Answer: A

Explanation: How does the NameNode handle DataNode failures?

 

The NameNode periodically receives a Heartbeat and a Blockreport from each of the DataNodes in the cluster. Receipt of a Heartbeat implies that the DataNode is functioning properly. A Blockreport contains a list of all blocks on a DataNode. When the NameNode notices that it has not received a heartbeat message from a DataNode after a certain amount of time, the DataNode is marked as dead. Since its blocks will now be under-replicated, the system begins replicating the blocks that were stored on the dead DataNode. The NameNode orchestrates the replication of data blocks from one DataNode to another. The replication data transfer happens directly between DataNodes, and the data never passes through the NameNode.

 

Note: If the NameNode stops receiving heartbeats from a DataNode, it presumes it to be dead and any data it held to be gone as well. Based on the block reports it had been receiving from the dead node, the NameNode knows which copies of blocks died along with the node and can make the decision to re-replicate those blocks to other DataNodes. It will also consult the rack awareness data in order to maintain the "two copies in one rack, one copy in another rack" replica rule when deciding which DataNode should receive a new copy of the blocks.

 

Reference: 24 Interview Questions & Answers for Hadoop MapReduce Developers, "How does the NameNode handle DataNode failures?"
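
As a rough sketch (assuming Hadoop 1.x property names and the commonly cited "2 x recheck interval + 10 x heartbeat interval" rule), the delay before the NameNode declares a DataNode dead and starts re-replication can be estimated from two settings; treat the defaults below as illustrative.

import org.apache.hadoop.conf.Configuration;

// Sketch: approximate interval after which the NameNode marks a DataNode
// dead and schedules re-replication of its blocks. Property names are the
// Hadoop 1.x ones; the formula and defaults are the commonly cited values.
public class DeadNodeTimeout {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        long heartbeatSecs = conf.getLong("dfs.heartbeat.interval", 3);          // seconds
        long recheckMillis = conf.getLong("heartbeat.recheck.interval", 300000); // milliseconds
        long timeoutMillis = 2 * recheckMillis + 10 * heartbeatSecs * 1000;
        // With the defaults this works out to roughly 10.5 minutes.
        System.out.println("DataNode considered dead after ~" + (timeoutMillis / 1000) + " seconds");
    }
}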

 

 

QUESTION 6

When a client requests a file, how does HDFS retrieve the blocks associated with that file?

 

A. The client polls the DataNodes for the block IDs

B. The NameNode queries the DataNodes for the block IDs

C. The NameNode reads the block IDs from memory

D. The NameNode reads the block IDs from disk

 

Answer: D

Explanation: Here is how a client RPC request to the Hadoop HDFS NameNode flows through the NameNode.

 

The Hadoop NameNode receives requests from HDFS clients in the form of Hadoop RPC requests over a TCP connection. Typical client requests include mkdir, getBlockLocations, create file, etc. Remember that HDFS separates metadata from actual file data, and that the NameNode is the metadata server. Hence, these requests are pure metadata requests; no data transfer is involved.
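
A minimal client-side sketch of that metadata flow, using the standard FileSystem API (the path is hypothetical): the getFileBlockLocations call results in a getBlockLocations RPC to the NameNode, which returns block offsets and DataNode hostnames but no file data.

import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: a client asking the NameNode where a file's blocks live.
// Only metadata comes back; no file data flows through the NameNode.
public class ListBlockLocations {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/hadoop/example.txt"); // hypothetical path
        FileStatus status = fs.getFileStatus(file);
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.println("offset " + block.getOffset() + " -> "
                    + Arrays.toString(block.getHosts()));
        }
    }
}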

 

 

QUESTION 7

Your cluster has 9 slave nodes. The cluster block size is set to 128 MB and its replication factor is set to three.

 

How will the Hadoop framework distribute block writes into HDFS from a reducer outputting a 300 MB file?

 

A. Reducers don’t write blocks into HDFS

B. The node on which the reducer is running will receive one copy of each block. The other replicas will be placed on other nodes in the cluster

C. The 9 block replicas will be written to the nodes more or less randomly; some nodes may receive multiple blocks and some may receive none

D. All 9 nodes will each receive exactly one block

E. The 9 block replicas will be written to 3 nodes, such that each of the three gets one copy of each block

 

Answer: C

Explanation: If the replication factor is set to 3, each block will be put on 3 separate nodes. The number of nodes a block is placed on is controlled by the replication factor. New blocks are placed almost randomly, with some consideration for distribution across different racks (when Hadoop is made aware of racks).
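
To make the numbers in this scenario concrete, here is a small arithmetic sketch (plain Java, no Hadoop APIs):

// Sketch: back-of-the-envelope arithmetic for this question. A 300 MB
// reducer output with a 128 MB block size splits into ceil(300/128) = 3
// blocks; with replication factor 3 that is 9 block replicas for the
// NameNode's placement policy to spread across the 9 slave nodes.
public class BlockReplicaMath {
    public static void main(String[] args) {
        long fileMb = 300;
        long blockMb = 128;
        int replication = 3;
        long blocks = (fileMb + blockMb - 1) / blockMb;  // ceiling division -> 3
        long replicas = blocks * replication;            // 9 replicas to place
        System.out.println(blocks + " blocks, " + replicas + " replicas to place");
    }
}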

 

 

QUESTION 8

Which two actions must you take if you are running a Hadoop cluster with a single NameNode and six DataNodes, and you want to change a configuration parameter so that it affects all six DataNodes?

 

A. You must restart the NameNode daemon to apply the changes to the cluster.

B. You must restart all six DataNode daemons to apply the changes to the cluster.

C. You don’t need to restart any daemon, as they will pick up changes automatically.

D. You must modify the configuration files on each of the six DataNode machines.

E. You must modify the configuration files on only one of the DataNode machines.

F. You must modify the configuration files on the NameNode only. DataNodes read their configuration from the master nodes.

 

Answer: BD

 

 

QUESTION 9

Each slave node in your cluster has four 2 TB hard drives installed (4 x 2TB). You set the value of the dfs.datanode.du.reserved parameter to 100 GB on each slave node. How does this alter HDFS block storage?

 

A. 25 GB on each hard drive may not be used to store HDFS blocks

B. 100 GB on each hard drive may not be used to store HDFS blocks

C. All hard drives may be used to store HDFS blocks as long as at least 100 GB in total is available on the node

D. A maximum of 100 GB on each hard drive may be used to store HDFS blocks

 

Answer: B

Explanation: dfs.datanode.du.reserved

Default: 0

 

This many bytes will be left free on the volumes used by the DataNodes (see dfs.data.dir). As our drives are dedicated to Hadoop we left this at 0, but if the drives host other data as well, set this to an appropriate value.

If you store anything other than Hadoop data on the disks, make sure to set dfs.datanode.du.reserved accordingly.
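
A small sketch of the arithmetic (assuming the value is specified in bytes and is reserved per configured data directory, i.e. per drive in this scenario):

import org.apache.hadoop.conf.Configuration;

// Sketch: how dfs.datanode.du.reserved shapes usable space. The value is in
// bytes and is reserved per volume, so reserving 100 GB on a 2 TB drive
// leaves roughly 1.9 TB on that drive for HDFS blocks. The drive size below
// is an assumption matching the question, not a measured value.
public class ReservedSpace {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        long reservedBytes = conf.getLong("dfs.datanode.du.reserved", 0L); // default 0
        long driveBytes = 2L * 1024 * 1024 * 1024 * 1024;                  // one 2 TB drive
        long usableBytes = driveBytes - reservedBytes;
        System.out.println("Usable for HDFS blocks on this volume: " + usableBytes + " bytes");
    }
}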

 

 

QUESTION 10

When planning a Hadoop cluster, what general rule governs the hardware requirements of master nodes compared to slave nodes?

 

A. The master nodes require more memory and greater disk capacity than the slave nodes

B. The master and slave nodes should have the same hardware configuration

C. The master nodes require more memory and no disk drives

D. The master nodes require more memory but less disk capacity

E. The master nodes require less memory and fewer disk drives than the slave nodes

 

Answer: D

Explanation: The master nodes handle cluster metadata, which the NameNode keeps in memory, while the slave nodes store and process the raw data; the master nodes therefore need more memory but far less disk capacity.
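
As a purely illustrative sketch of why that is (using the commonly cited rough figure of about 150 bytes of NameNode heap per file, directory, or block object, not official Cloudera sizing guidance), NameNode memory scales with the size of the namespace rather than with raw data volume:

// Sketch: rough NameNode heap estimate. All figures are illustrative
// assumptions; the ~150 bytes-per-object number is a widely quoted rule
// of thumb, not an exact measurement.
public class NameNodeHeapEstimate {
    public static void main(String[] args) {
        long files = 10000000L;      // assumed number of files in the namespace
        long blocksPerFile = 2;      // assumed average blocks per file
        long bytesPerObject = 150;   // rough heap cost per file/directory/block object
        long objects = files + files * blocksPerFile;
        long heapBytes = objects * bytesPerObject;
        System.out.println("Approximate NameNode heap needed: "
                + heapBytes / (1024 * 1024) + " MB");
    }
}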
