Achieve New Updated (September) Cloudera CCD-333 Examination Questions 21-30

September 24, 2015


QUESTION 21

Which of the following best describes the map method input and output?

 

A.

It accepts a single key-value pair as input and can emit only one key-value pair as output.

B.

It accepts a list of key-value pairs as input but can emit only one key-value pair as output.

C.

It accepts a single key-value pair as input and emits a single key and a list of corresponding values as output.

D.

It accepts a single key-value pair as input and can emit any number of key-value pairs as output, including zero.

 

Answer: D

Explanation: public class Mapper<KEYIN,VALUEIN,KEYOUT,VALUEOUT> extends Object

Maps input key/value pairs to a set of intermediate key/value pairs.

 

Maps are the individual tasks which transform input records into intermediate records. The transformed intermediate records need not be of the same type as the input records. A given input pair may map to zero or many output pairs.

 

Reference: org.apache.hadoop.mapreduce, Class Mapper&lt;KEYIN,VALUEIN,KEYOUT,VALUEOUT&gt;
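
For illustration, here is a minimal sketch of a map method that emits zero, one, or many pairs per input pair. It uses the new org.apache.hadoop.mapreduce API; the class name TokenMapper and the whitespace-tokenizing logic are assumptions for the sketch, not part of the question.

// A minimal sketch: one input key/value pair may yield zero, one, or many
// output pairs, depending on the contents of the record.
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // An empty line emits nothing; a line with n tokens emits n pairs.
    for (String token : value.toString().split("\\s+")) {
      if (token.isEmpty()) continue;
      word.set(token);
      context.write(word, ONE);
    }
  }
}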

 

 

QUESTION 22

What is a SequenceFile?

 

A.

A SequenceFile contains a binary encoding of an arbitrary number of homogeneous writable objects.

B.

A SequenceFile contains a binary encoding of an arbitrary number of heterogeneous writable objects.

C.

A SequenceFile contains a binary encoding of an arbitrary number of WritableComparable objects, in sorted order.

D.

A SequenceFile contains a binary encoding of an arbitrary number of key-value pairs. Each key must be the same type. Each value must be the same type.

 

Answer: D

Explanation: SequenceFile is a flat file consisting of binary key/value pairs.

 

There are 3 different SequenceFile formats:

 

Uncompressed key/value records.

Record compressed key/value records – only ‘values’ are compressed here.

Block compressed key/value records – both keys and values are collected in ‘blocks’ separately and compressed. The size of the ‘block’ is configurable.

 

Reference: http://wiki.apache.org/hadoop/SequenceFile
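
As a hedged illustration of that key/value contract, the sketch below writes and then reads a small SequenceFile of Text keys and IntWritable values. The file name demo.seq is an assumption, and the createWriter/Reader signatures shown are the classic Hadoop 1.x-era ones (deprecated in later releases, but still widely documented).

// A minimal sketch: every key is a Text and every value is an IntWritable,
// as declared when the writer is created.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path path = new Path("demo.seq"); // illustrative path

    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, path, Text.class, IntWritable.class);
    try {
      writer.append(new Text("apple"), new IntWritable(3));
      writer.append(new Text("pear"), new IntWritable(7));
    } finally {
      writer.close();
    }

    SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
    try {
      Text key = new Text();
      IntWritable value = new IntWritable();
      while (reader.next(key, value)) {
        System.out.println(key + "\t" + value);
      }
    } finally {
      reader.close();
    }
  }
}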

 

 

QUESTION 23

You are developing a combiner that takes as input Text keys, IntWritable values, and emits Text keys, IntWritable values. Which interface should your class implement?

 

A.

Mapper <Text, IntWritable, Text, IntWritable>

B.

Reducer <Text, Text, IntWritable, IntWritable>

C.

Reducer <Text, IntWritable, Text, IntWritable>

D.

Combiner <Text, IntWritable, Text, IntWritable>

E.

Combiner <Text, Text, IntWritable, IntWritable>

 

Answer: C

Explanation: Hadoop has no separate Combiner interface; a combiner is written as a Reducer and registered on the job with setCombinerClass(). Because a combiner consumes map output and feeds the reduce phase, its input and output types must both match the map output types, here Text keys and IntWritable values.
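
As a minimal sketch, a combiner for this question could look like the class below, using the old org.apache.hadoop.mapred API (where Reducer is an interface, matching the question's wording). The class name WordCountCombiner and the summing logic are illustrative assumptions; it would be registered with conf.setCombinerClass(WordCountCombiner.class).

// A hedged sketch: the combiner implements Reducer<Text, IntWritable, Text,
// IntWritable>, consuming and emitting the same key/value types.
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WordCountCombiner extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {
  public void reduce(Text key, Iterator<IntWritable> values,
      OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get(); // partial aggregation on the map side
    }
    output.collect(key, new IntWritable(sum)); // same types in and out
  }
}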

 

 

QUESTION 24

What happens if the NameNode crashes?

 

A.

HDFS becomes unavailable until the NameNode is restored.

B.

The Secondary NameNode seamlessly takes over and there is no service interruption.

C.

HDFS becomes unavailable to new MapReduce jobs, but running jobs will continue until completion.

D.

HDFS becomes temporarily unavailable until an administrator starts redirecting client requests to the Secondary NameNode.

 

Answer: A

Explanation: The NameNode is a Single Point of Failure for the HDFS Cluster. When the NameNode goes down, the file system goes offline.

 

Reference: 24 Interview Questions & Answers for Hadoop MapReduce Developers, “What is a NameNode? How many instances of NameNode run on a Hadoop cluster?”

 

 

QUESTION 25

Your client application submits a MapReduce job to your Hadoop cluster. The Hadoop framework looks for an available slot to schedule the MapReduce operations on which of the following Hadoop computing daemons?

 

A.

DataNode

B.

NameNode

C.

JobTracker

D.

TaskTracker

E.

Secondary NameNode

 

Answer: C

Explanation: The JobTracker is the daemon service for submitting and tracking MapReduce jobs in Hadoop. Only one JobTracker process runs on any Hadoop cluster, in its own JVM process, and in a typical production cluster it runs on a separate machine. Each slave node is configured with the JobTracker's location. The JobTracker is a single point of failure for the Hadoop MapReduce service: if it goes down, all running jobs are halted. The JobTracker performs the following actions (from the Hadoop Wiki):

 

1. Client applications submit jobs to the JobTracker.
2. The JobTracker talks to the NameNode to determine the location of the data.
3. The JobTracker locates TaskTracker nodes with available slots at or near the data.
4. The JobTracker submits the work to the chosen TaskTracker nodes.
5. The TaskTracker nodes are monitored. If they do not submit heartbeat signals often enough, they are deemed to have failed and the work is scheduled on a different TaskTracker.
6. A TaskTracker notifies the JobTracker when a task fails. The JobTracker decides what to do then: it may resubmit the job elsewhere, it may mark that specific record as something to avoid, and it may even blacklist the TaskTracker as unreliable.
7. When the work is completed, the JobTracker updates its status.

 

Client applications can poll the JobTracker for information.

 

Reference: 24 Interview Questions & Answers for Hadoop MapReduce Developers, “What is a JobTracker in Hadoop? How many instances of JobTracker run on a Hadoop cluster?”
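
For context, here is a sketch of how a client application hands a job to the JobTracker through the old-API JobClient; the job name, paths, and the commented-out mapper/reducer classes are illustrative assumptions.

// A minimal old-API (org.apache.hadoop.mapred) submission sketch.
// JobClient.runJob() submits the job to the JobTracker, which schedules the
// map and reduce tasks onto TaskTracker slots at or near the data.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class SubmitJob {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(SubmitJob.class);
    conf.setJobName("demo"); // illustrative name
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    // Illustrative mapper/reducer classes would be set here:
    // conf.setMapperClass(MyMapper.class);
    // conf.setReducerClass(MyReducer.class);
    FileInputFormat.setInputPaths(conf, new Path("in"));   // illustrative path
    FileOutputFormat.setOutputPath(conf, new Path("out")); // illustrative path
    JobClient.runJob(conf); // blocks, polling the JobTracker, until completion
  }
}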

 

 

QUESTION 26

You need to create a GUI application to help your company’s sales people add and edit customer information. Would HDFS be appropriate for this customer information file?

 

A.

Yes, because HDFS is optimized for random access writes.

B.

Yes, because HDFS is optimized for fast retrieval of relatively small amounts of data.

C.

No, because HDFS can only be accessed by MapReduce applications.

D.

No, because HDFS is optimized for write-once, streaming access for relatively large files.

 

Answer: D

Explanation: HDFS is designed to support very large files. Applications that are compatible with HDFS are those that deal with large data sets. These applications write their data only once, but they read it one or more times and require these reads to be satisfied at streaming speeds. HDFS supports write-once-read-many semantics on files.

 

Reference: 24 Interview Questions & Answers for Hadoop MapReduce Developers, “What is HDFS? How is it different from traditional file systems?”
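
To make that access pattern concrete, the sketch below streams a file into HDFS once and reads it back sequentially; the path and record contents are illustrative assumptions. Note that there is no call here for editing a record in place, which is exactly why HDFS is a poor fit for the GUI application in the question.

// A minimal sketch of HDFS's write-once, streaming-read access pattern.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsStreamingDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path path = new Path("/tmp/customers.txt"); // illustrative path

    // Write once, as a stream; the file is not randomly updatable afterwards.
    FSDataOutputStream out = fs.create(path);
    out.writeBytes("alice\t42\nbob\t17\n"); // illustrative records
    out.close();

    // Read many times, again as a stream.
    FSDataInputStream in = fs.open(path);
    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    String line;
    while ((line = reader.readLine()) != null) {
      System.out.println(line);
    }
    reader.close();
  }
}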

 

 

QUESTION 27

Workflows expressed in Oozie can contain:

 

A.

Iterative repetition of MapReduce jobs until a desired answer or state is reached.

B.

Sequences of MapReduce and Pig jobs. These are limited to linear sequences of actions with exception handlers but no forks.

C.

Sequences of MapReduce jobs only; no Pig or Hive tasks or jobs. These MapReduce sequences can be combined with forks and path joins.

D.

Sequences of MapReduce and Pig jobs. These sequences can be combined with other actions including forks, decision points, and path joins.

 

Answer: D

Reference: http://incubator.apache.org/oozie/docs/3.1.3/docs/WorkflowFunctionalSpec.html (workflow definition, first sentence)

 

 

QUESTION 28

For each intermediate key, each reducer task can emit:

 

A.

One final key-value pair per key; no restrictions on the type.

B.

One final key-value pair per value associated with the key; no restrictions on the type.

C.

As many final key-value pairs as desired, as long as all the keys have the same type and all the values have the same type.

D.

As many final key-value pairs as desired, but they must have the same type as the intermediate key-value pairs.

E.

As many final key-value pairs as desired. There are no restrictions on the types of those key-value pairs (i.e., they can be heterogeneous).

Answer: C

Explanation: Reducer reduces a set of intermediate values which share a key to a (typically smaller) set of values. The reduce method may emit any number of output pairs for a given key via context.write() (or OutputCollector.collect() in the old API), but every emitted pair must match the reducer's declared output key and value types (KEYOUT and VALUEOUT).

 

Reference: Hadoop Map-Reduce Tutorial
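
As an illustration of that contract, the sketch below emits three output pairs for a single intermediate key, all with the declared Text/IntWritable output types. The per-key statistics (sum, count, max) are assumptions for the sketch.

// A minimal new-API sketch: one intermediate key yields several output pairs,
// each matching the declared KEYOUT/VALUEOUT types (Text, IntWritable).
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class StatsReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0, count = 0, max = Integer.MIN_VALUE;
    for (IntWritable v : values) {
      sum += v.get();
      count++;
      max = Math.max(max, v.get());
    }
    // Three output pairs for one intermediate key, all Text/IntWritable.
    context.write(new Text(key + ":sum"), new IntWritable(sum));
    context.write(new Text(key + ":count"), new IntWritable(count));
    context.write(new Text(key + ":max"), new IntWritable(max));
  }
}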

 

 

QUESTION 29

Given a directory of files with the following structure: line number, tab character, string:

 

Example:

 

1. abialkjfjkaoasdfjksdlkjhqweroij

 

2. kadf jhuwqounahagtnbvaswslmnbfgy

 

3. kjfteiomndscxeqalkzhtopedkfslkj

 

You want to send each line as one record to your Mapper. Which InputFormat would you use to complete the line: setInputFormat(________.class);

 

A.

BDBInputFormat

B.

KeyValueTextInputFormat

C.

SequenceFileInputFormat

D.

SequenceFileAsTextInputFormat

 

Answer: B

Explanation: The input is plain text in which each line holds a key (the line number), a tab character, and a value (the string). KeyValueTextInputFormat treats each line as one record and splits it at the first tab into a key and a value, which matches this layout exactly. SequenceFileInputFormat and SequenceFileAsTextInputFormat read binary SequenceFiles rather than plain text, and BDBInputFormat is not a standard Hadoop InputFormat.
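
A hedged configuration sketch follows, using the old org.apache.hadoop.mapred API that the question's setInputFormat() call implies; the class name and input path are illustrative assumptions.

// A minimal sketch: KeyValueTextInputFormat splits each line at the first tab,
// so the mapper receives the line number as a Text key and the string as a
// Text value, one record per line.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;

public class LineRecordsJob {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(LineRecordsJob.class);
    conf.setInputFormat(KeyValueTextInputFormat.class); // fills the blank
    FileInputFormat.setInputPaths(conf, new Path("in")); // illustrative path
    // The mapper would then be declared as Mapper<Text, Text, ...>.
  }
}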

 

 

QUESTION 30

You need to create a job that does frequency analysis on input data. You will do this by writing a Mapper that uses TextInputFormat and splits each value (a line of text from an input file) into individual characters. For each of these characters, you will emit the character as a key and an IntWritable as the value. Since this will produce proportionally more intermediate data than input data, which resources could you expect to be likely bottlenecks?

 

A.

Processor and RAM

B.

Processor and disk I/O

C.

Disk I/O and network I/O

D.

Processor and network I/O

 

Answer: B

Explanation: Splitting every line into individual characters is processor-intensive, and because far more intermediate data is produced than read, the map outputs must be repeatedly spilled to and sorted on local disk, making the processor and disk I/O the likely bottlenecks.
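
A minimal sketch of the mapper described above follows; the class name CharFrequencyMapper is an assumption. It shows why the intermediate data balloons: every input character becomes a full key-value record.

// Each character of each line is emitted as its own (Text, IntWritable) pair,
// so the intermediate output is several times larger than the input.
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CharFrequencyMapper
    extends Mapper<LongWritable, Text, Text, IntWritable> {
  private static final IntWritable ONE = new IntWritable(1);
  private final Text character = new Text();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();
    for (int i = 0; i < line.length(); i++) {
      character.set(String.valueOf(line.charAt(i)));
      context.write(character, ONE); // one record per input character
    }
  }
}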
