Latest Certified Success Dumps Download

CISCO, MICROSOFT, COMPTIA, HP, IBM, ORACLE, VMWARE
CCA-470 Examination Questions (September)

Achieve New Updated (September) Cloudera CCA-470 Examination Questions 51-60

September 24, 2015

Ensurepass

 

QUESTION 51

Under which scenario would it be most appropriate to consider using faster (e.g., 10 Gigabit) Ethernet as the network fabric for your Hadoop cluster?

 

A.

When the typical workload generates a large amount of intermediate data, on the order of the input data itself.

B.

When the typical workload consists of processor-intensive tasks.

C.

When the typical workload consumes a large amount of input data, relative to the entire capacity of HDFS.

D.

When the typical workload generates a large amount of output data, significantly larger than the amount of intermediate data.

 

Answer: A

Explanation: When we encounter applications that produce large amounts of intermediate data (on the order of the same amount as is read in), we recommend two ports on a single Ethernet card or two channel-bonded Ethernet cards to provide 2 Gbps per machine. Alternatively, for customers who have already moved to 10 Gigabit Ethernet or Infiniband, these solutions can be used to address network-bound workloads. Be sure that your operating system and BIOS are compatible if you’re considering switching to 10 Gigabit Ethernet.

 

Reference: Cloudera’s Support Team Shares Some Basic Hardware Recommendations
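
To see why such workloads become network-bound, here is a back-of-the-envelope sketch; every figure in it is a hypothetical assumption chosen only to illustrate the arithmetic:

# Rough shuffle-bandwidth estimate for a shuffle-heavy MRv1 job.
# All figures below are hypothetical, for illustration only.

input_bytes = 1.0e12          # job reads ~1 TB from HDFS
intermediate_ratio = 1.0      # intermediate data on the order of the input (scenario A)
nodes = 20                    # slave nodes participating in the shuffle
shuffle_window_s = 300        # time budget for the shuffle phase, in seconds

per_node_bytes = input_bytes * intermediate_ratio / nodes
per_node_gbps = per_node_bytes * 8 / shuffle_window_s / 1e9

print(f"Per-node shuffle traffic: {per_node_gbps:.2f} Gbps")
# ~1.33 Gbps: already above a single 1 Gbps link, which is why bonded
# 2 Gbps or 10 Gigabit Ethernet is recommended for such workloads.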

 

 

QUESTION 52

 

What’s the relationship between JobTrackers and TaskTrackers?

 

A.

The JobTracker runs on a single master node and accepts MapReduce jobs from clients. A TaskTracker runs on every slave node and is responsible for managing actual map and reduce tasks.

B.

Every node in the cluster runs both a JobTracker and a TaskTracker. The JobTrackers manage jobs, and the TaskTrackers are responsible for managing actual map and reduce tasks.

C.

The TaskTracker runs on a single master node and accepts MapReduce jobs from clients. A JobTracker runs on every slave node and is responsible for managing map and reduce tasks.

D.

The JobTracker runs on a single master node, but forks a separate instance of itself for every client MapReduce job. A TaskTracker runs on every slave node and is responsible for managing actual map and reduce tasks.

 

Answer: A

Reference: http://hadoop.apache.org/mapreduce/docs/r0.22.0/mapred_tutorial.html (Overview, 4th paragraph)
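
As a mental model of answer A, consider the following sketch. It is illustrative only, not Hadoop's actual implementation:

# Toy model of the MRv1 master/worker split described in answer A.

class TaskTracker:
    """Runs on every slave node; executes the actual map and reduce tasks."""
    def __init__(self, node):
        self.node = node

    def launch(self, task):
        print(f"{self.node}: running {task}")

class JobTracker:
    """Runs on a single master node; accepts jobs and hands tasks to TaskTrackers."""
    def __init__(self, trackers):
        self.trackers = trackers

    def submit(self, job, tasks):
        # Round-robin stands in for the real heartbeat-driven assignment.
        for i, task in enumerate(tasks):
            self.trackers[i % len(self.trackers)].launch(f"{job}/{task}")

jt = JobTracker([TaskTracker(f"slave{i}") for i in range(3)])
jt.submit("job_0001", ["map_0", "map_1", "map_2", "reduce_0"])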

 

 

QUESTION 53

Your developers request that you enable them to use Hive on your Hadoop cluster. What do you install and/or configure?

 

A.

Install the Hive interpreter on the client machines only, and configure a shared remote Hive Metastore.

B.

Install the Hive Interpreter on the client machines and all the slave nodes, and configure a shared remote Hive Metastore.

C.

Install the Hive interpreter on the master node running the JobTracker, and configure a shared remote Hive Metastore.

D.

Install the Hive interpreter on the client machines and all nodes on the cluster

 

Answer: A

Explanation: Hive is a client-side application: the Hive interpreter runs on the client machine and translates HiveQL queries into MapReduce jobs, so it only needs to be installed on the client machines. A shared remote Metastore lets all developers see the same table definitions.
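
For reference, a shared remote Metastore is pointed to from each client's hive-site.xml. The sketch below generates such a file; the hive.metastore.uris property name is real, while the host shown is a hypothetical placeholder (9083 is the conventional metastore port):

# Generate a minimal hive-site.xml pointing clients at a shared remote Metastore.
import xml.etree.ElementTree as ET

def hive_site(metastore_uri):
    # Build a minimal hive-site.xml with a single property entry.
    conf = ET.Element("configuration")
    prop = ET.SubElement(conf, "property")
    ET.SubElement(prop, "name").text = "hive.metastore.uris"
    ET.SubElement(prop, "value").text = metastore_uri
    return ET.tostring(conf, encoding="unicode")

print(hive_site("thrift://metastore.example.com:9083"))  # hypothetical host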

 

 

QUESTION 54

 

For a MapReduce job on a cluster running MapReduce v1 (MRv1), what’s the relationship between tasks and task attempts?

 

A.

There are always at least as many task attempts as there are tasks.

B.

There are always at most as many task attempts as there are tasks.

C.

There are always exactly as many task attempts as there are tasks.

D.

The developer sets the number of task attempts on job submission.

 

Answer: A

Explanation: Every task is executed by at least one task attempt, and a failed task is retried as a new attempt (speculative execution can also launch duplicate attempts). There can therefore be more task attempts than tasks, but never fewer.

 

 

QUESTION 55

Which of the following statements are accurate in describing features of Hadoop rack awareness? (Choose 2)

 

A.

HDFS is rack aware but MapReduce daemons are not.

B.

Rack location is considered in the HDFS block placement policy.

C.

Hadoop gives preference to intra-rack data transfer in order to conserve bandwidth.

D.

Even for small clusters on a single rack, configuring rack awareness will improve performance.

E.

Configuration of rack awareness is accomplished using a configuration file. You cannot use a rack topology script.

 

Answer: BC
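
On option E in particular: rack awareness is in fact commonly configured with a rack topology script (named by the topology.script.file.name property in MRv1-era Hadoop), which is exactly why E is false. A minimal sketch of such a script follows; the subnet-to-rack mapping in it is a hypothetical example:

#!/usr/bin/env python
# Minimal rack topology script: Hadoop invokes it with one or more DataNode
# IPs/hostnames as arguments and expects a rack path for each on stdout.
import sys

RACKS = {
    "10.1.1.": "/rack1",   # hypothetical subnet-to-rack mapping
    "10.1.2.": "/rack2",
}

def rack_for(host):
    for prefix, rack in RACKS.items():
        if host.startswith(prefix):
            return rack
    return "/default-rack"  # Hadoop's standard fallback rack name

print(" ".join(rack_for(h) for h in sys.argv[1:]))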

 

 

QUESTION 56

What is the standard configuration of slave nodes in a Hadoop cluster?

 

A.

Each slave node either runs a TaskTracker or a DataNode daemon, but not both.

B.

Each slave node runs a JobTracker and a DataNode daemon.

C.

Each slave node runs a TaskTracker and a DataNode daemon.

D.

Each slave node runs a DataNode daemon, but only a fraction of the slave nodes run TaskTrackers.

E.

Each slave node runs a TaskTracker, but only a fraction of the slave nodes run DataNode daemons.

 

Answer: C

Reference: http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/ (second paragraph on the page)
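
Summarized as a quick sketch (master-side daemon placement varies between deployments; the slave-side pairing is the point of the question):

# Typical MRv1 daemon placement: slaves colocate storage (DataNode) and
# compute (TaskTracker) so tasks can run where the data already lives.
standard_layout = {
    "master node(s)": ["NameNode", "JobTracker", "SecondaryNameNode"],
    "every slave node": ["DataNode", "TaskTracker"],
}
for role, daemons in standard_layout.items():
    print(f"{role}: {', '.join(daemons)}")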

 

QUESTION 57

Your cluster is running MapReduce v1 (MRv1), with default replication set to 3 and a block size of 64MB. Which best describes the file read process when a client application connects to the cluster and requests a 50MB file?

 

A.

The client queries the NameNode for the locations of the block, and reads all three copies. The first copy to complete transfer to the client is the one the client reads as part of Hadoop’s execution framework.

B.

The client queries the NameNode for the locations of the block, and reads from the first location in the list it receives.

C.

The client queries the NameNode for the locations of the block, and reads from a random location in the list it receives to eliminate network I/O loads by balancing which nodes it retrieves data from at any given time.

D.

The client queries the NameNode, then retrieves the block from the nearest DataNode to the client; that DataNode passes the block back to the client.

 

Answer: D
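
A toy model of that read path follows. It is illustrative only; the real HDFS client uses RPC and full network-topology distances, not this simplification:

# Toy sketch of the HDFS read path: ask the NameNode for block locations,
# then read each block from the replica closest to the client.

def distance(client_rack, replica_rack):
    # Same rack counts as closer than a remote rack.
    return 0 if client_rack == replica_rack else 1

def read_file(block_locations, client_rack):
    data = b""
    for replicas in block_locations:  # one replica list per block
        best = min(replicas, key=lambda r: distance(client_rack, r["rack"]))
        data += best["data"]          # read the block from the nearest DataNode
    return data

# A 50MB file fits in a single 64MB block, stored as three replicas.
locations = [[
    {"rack": "/rack1", "data": b"x" * 50},  # stand-in for 50MB of data
    {"rack": "/rack2", "data": b"x" * 50},
    {"rack": "/rack2", "data": b"x" * 50},
]]
print(len(read_file(locations, client_rack="/rack1")), "bytes read")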

 

 

QUESTION 58

What is the Secondary NameNode?

 

A.

An alternate data channel for clients to reach HDFS, should the NameNode become too busy.

B.

A process that performs a checkpoint operation on the files produced by the NameNode.

C.

A data channel between the primary name node and the tertiary NameNode.

D.

A process purely intended to perform backups of the NameNode.

E.

A standby NameNode, for high availability.

 

Answer: B

Reference: http://wiki.apache.org/hadoop/FAQ#What_is_the_purpose_of_the_secondary_name-node.3F (3.2)
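
Conceptually, the checkpoint merges the persisted namespace image (fsimage) with the accumulated edit log so the log can be truncated. A toy sketch of that operation follows; real fsimage and edits files are binary, not dicts and tuples:

# Toy sketch of a Secondary NameNode checkpoint: replay the edit log on top
# of the last saved fsimage to produce a new, compact fsimage.

def checkpoint(fsimage, edits):
    merged = dict(fsimage)            # start from the last saved namespace
    for op, path, block in edits:     # replay each logged operation in order
        if op == "create":
            merged[path] = block
        elif op == "delete":
            merged.pop(path, None)
    return merged                     # the new fsimage; the edit log can now be truncated

fsimage = {"/a": "blk_1"}
edits = [("create", "/b", "blk_2"), ("delete", "/a", None)]
print(checkpoint(fsimage, edits))     # {'/b': 'blk_2'}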

 

 

QUESTION 59

 

You have a cluster running with the Fair Scheduler enabled and configured. You submit multiple jobs to the cluster. Each job is assigned to a pool. How are jobs scheduled? (Choose 2)

 

A.

Each pool’s share of task slots may change throughout the course of job execution.

B.

Pools get a dynamically-allocated share of the available task slots (subject to additional constraints).

C.

Each pool gets 1/M of the total available task slots, where M is the number of nodes in the cluster

D.

Pools are assigned priorities. Pools with higher priorities are executed before pools with lower priorities.

E.

Each pool gets 1/N of the total available task slots, where N is the number of jobs running on the cluster.

F.

Each pool’s share of task slots remains static within the execution of any individual job.

 

Answer: AB

Explanation: The Fair Scheduler gives each pool a dynamically allocated share of the available task slots, subject to weights and minimum-share constraints, and recomputes that share as jobs start and finish, so a pool’s share may change during execution. Pools are not run in strict priority order, and shares are not fixed fractions based on node or job counts.
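
A toy illustration of the dynamic share computation follows; the pool names, weights, and slot counts are hypothetical, and the real Fair Scheduler also honors minimum shares and per-job fairness within pools:

# Toy weighted fair-share split: slots are divided among pools in proportion
# to their weights and recomputed as demand changes, which is why a pool's
# share can change while jobs are running.

def fair_shares(total_slots, pool_weights):
    total_weight = sum(pool_weights.values())
    return {pool: round(total_slots * w / total_weight, 1)
            for pool, w in pool_weights.items()}

print(fair_shares(100, {"etl": 2, "adhoc": 1, "research": 1}))
# {'etl': 50.0, 'adhoc': 25.0, 'research': 25.0}
print(fair_shares(100, {"etl": 2, "adhoc": 1}))  # 'research' went idle
# {'etl': 66.7, 'adhoc': 33.3}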

 

 

QUESTION 60

Identify four characteristics of a 300MB file that has been written to HDFS with a block size of 128MB and all other Hadoop defaults unchanged.

 

A.

The file will consume 1152MB of space in the cluster

B.

The third block will be 64MB

C.

The third initial block will be 44MB

D.

Two of the initial blocks will be 128MB

E.

Each block will be replicated three times

F.

The file will be split into three blocks when initially written into the cluster

G.

Each block will be replicated nine times

H.

All three blocks will be 128MB

 

Answer: CDEF

Explanation:
Not A: the file will consume (2 × 128 + 44) × 3 = 900MB of space in the cluster, not 1152MB.
C (not B): the third block is 300 − 2 × 128 = 44MB.
D (not H): all blocks in a file except the last block are the same size.
E (not G): all blocks are replicated three times by default.
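
The arithmetic is easy to verify with a short sketch:

# Verify the block layout and raw storage cost of a 300MB file written with
# a 128MB block size and the default replication factor of 3.

file_mb, block_mb, replication = 300, 128, 3

full_blocks, last_block = divmod(file_mb, block_mb)
blocks = [block_mb] * full_blocks + ([last_block] if last_block else [])

print(blocks)                                  # [128, 128, 44] -> three blocks (F, D, C)
print(sum(blocks) * replication, "MB stored")  # 900 MB raw (E), not 1152MB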

 

Free VCE & PDF File for Cloudera CCA-470 Real Exam

Instant Access to Free VCE Files: CompTIA | VMware | SAP …
Instant Access to Free PDF Files: CompTIA | VMware | SAP …
