Achieve New Updated (September) Cloudera CCB-400 Examination Questions 11-20

September 24, 2015

Ensurepass

 

QUESTION 11

Given the following HBase dataset, which is labeled with row numbers. . .

 

clip_image001

 

Which of the following lists of row numbers is the correct order that HBase would store this data?

 

A. 1, 5, 2, 4, 3, 6

B. 4, 1, 2, 6, 3, 5

C. 4, 6, 3, 1, 5, 2

D. 3, 4, 6, 1, 2, 5

 

Answer: C
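
The dataset image for this question is not available, so the answer cannot be checked here. As a rough illustration of the underlying rule, HBase keeps rows sorted by row key in byte-wise lexicographic order (within a row, cells sort by column and then by newest timestamp first). The sketch below covers only the row key part; the class name and the sample keys are hypothetical and are not taken from the question.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowKeyOrder {
    // Compare two row keys byte by byte, treating each byte as unsigned.
    static int compareKeys(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    // Return a copy of the keys in byte-wise lexicographic order.
    static List<String> sorted(List<String> keys) {
        List<String> out = new ArrayList<>(keys);
        out.sort((x, y) -> compareKeys(
                x.getBytes(StandardCharsets.UTF_8),
                y.getBytes(StandardCharsets.UTF_8)));
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical keys: uppercase sorts before lowercase, and "row10" before "row2".
        System.out.println(sorted(Arrays.asList("row10", "row2", "row1", "Row3")));
    }
}
```

Note that the order is not numeric: "row10" sorts before "row2" because the comparison stops at the first differing byte.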

 

 

QUESTION 12

Given the following dataset:

 

clip_image002

 
How many store files will be contained in your region(s) immediately following a major compaction?

 

A. Four

B. Three

C. Two

D. One

 

Answer: C

Explanation: There are two column families (Managers and Skills), so there will be two files.

 

Note:

* Physically, all column family members are stored together on the filesystem. Because tunings and storage specifications are done at the column family level, it is advised that all column family members have the same general access pattern and size characteristics.

 

* HBase currently does not do well with anything above two or three column families, so keep the number of column families in your schema low. Currently, flushing and compactions are done on a per Region basis, so if one column family is carrying the bulk of the data and bringing on flushes, the adjacent families will also be flushed even though the amount of data they carry is small. When there are many column families, the flushing and compaction interaction can cause a lot of needless I/O (to be addressed by changing flushing and compaction to work on a per column family basis).

* When changes are made to either Tables or ColumnFamilies (e.g., region size, block size), these changes take effect the next time there is a major compaction and the StoreFiles get re-written.

* StoreFiles are composed of blocks. The blocksize is configured on a per-ColumnFamily basis.

Compression happens at the block level within StoreFiles.
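
The counting behind the answer can be sketched as follows, assuming a single region and assuming that a major compaction rewrites all store files of a column family into one. The family names come from the explanation above; the class name and the pre-compaction file counts are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MajorCompactionSketch {
    // Model of a major compaction in one region: every column family (store)
    // that holds any files ends up with exactly one store file.
    static Map<String, Integer> majorCompact(Map<String, Integer> filesPerFamily) {
        Map<String, Integer> after = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : filesPerFamily.entrySet()) {
            after.put(e.getKey(), e.getValue() > 0 ? 1 : 0);
        }
        return after;
    }

    // Total number of store files across all column families.
    static int totalFiles(Map<String, Integer> filesPerFamily) {
        int total = 0;
        for (int count : filesPerFamily.values()) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> before = new LinkedHashMap<>();
        before.put("Managers", 3); // hypothetical count before compaction
        before.put("Skills", 2);   // hypothetical count before compaction
        System.out.println(totalFiles(majorCompact(before)));
    }
}
```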

 

 

QUESTION 13

Your client connects to HBase for the first time to read a row user_1234 located in a table Users. What process does your client use to find the correct RegionServer to which it should send the request?

 
A. The client looks up the location of ROOT, in which it looks up the location of META, in which it looks up the location of the correct Users region.

B. The client looks up the location of the master, in which it looks up the location of META, in which it looks up the location of the correct Users region.

C. The client looks up the location of ROOT, in which it looks up the location of the correct Users region.

D. The client queries the master to find the location of the Users table.

 

Answer: A

Explanation:

* The general flow is that a new client first contacts the ZooKeeper quorum (a separate cluster of ZooKeeper nodes) to find a particular row key. It does so by retrieving from ZooKeeper the server name (i.e., host name) that hosts the -ROOT- region. With that information it can query that server to get the server that hosts the .META. table. Both of these details are cached and only looked up once. Lastly, it can query the .META. server and retrieve the server that has the row the client is looking for.

* The HBase client HTable is responsible for finding RegionServers that are serving the particular row range of interest. It does this by querying the .META. and -ROOT- catalog tables. After locating the required region(s), the client directly contacts the RegionServer serving that region (i.e., it does not go through the master) and issues the read or write request. This information is cached in the client so that subsequent requests need not go through the lookup process. Should a region be reassigned, either by the master load balancer or because a RegionServer has died, the client will requery the catalog tables to determine the new location of the user region.
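
The three-step lookup described above can be sketched with sorted maps standing in for the catalog tables. This is a simplified model, not the HBase client code: the class name, host names, and region start keys are all hypothetical, and client-side caching is left out.

```java
import java.util.TreeMap;

class RegionLookupSketch {
    // Hypothetical host; in a real cluster this location is read from ZooKeeper.
    static final String ROOT_HOST_FROM_ZOOKEEPER = "rs-root.example";

    // Both catalogs map a region start key ("table,startRow") to the host serving it.
    static final TreeMap<String, String> ROOT = new TreeMap<>(); // locates .META. regions
    static final TreeMap<String, String> META = new TreeMap<>(); // locates user regions

    static {
        ROOT.put("", "rs-meta.example");
        META.put("Users,", "rs-1.example");
        META.put("Users,user_5000", "rs-2.example");
    }

    // Follow ZooKeeper -> -ROOT- -> .META. and report the hosts visited.
    static String locate(String table, String rowKey) {
        String lookupKey = table + "," + rowKey;
        // Step 1: ZooKeeper tells the client which host serves -ROOT-.
        String rootHost = ROOT_HOST_FROM_ZOOKEEPER;
        // Step 2: -ROOT- names the host serving the relevant .META. region.
        String metaHost = ROOT.floorEntry(lookupKey).getValue();
        // Step 3: .META. names the host serving the region that contains the row.
        String regionHost = META.floorEntry(lookupKey).getValue();
        return rootHost + " -> " + metaHost + " -> " + regionHost;
    }

    public static void main(String[] args) {
        System.out.println(locate("Users", "user_1234"));
    }
}
```

The floorEntry call picks the region with the greatest start key that is not larger than the requested key, which is how a sorted catalog can resolve any row to a single region.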

 

Reference: HBase Architecture 101 – Storage

 

 

QUESTION 14

Given that the following is your entire dataset:

 

clip_image002[1]

 

How many sets of physical files will be read during a scan of the entire dataset immediately following a major compaction?

 
A. Two

B. One

C. Three

D. Four

 

Answer: A

Explanation: There are two column families (Managers and Skills), so there will be two files.

 


 

 

QUESTION 15

From within an HBase application, you would like to create a new table named weblogs.

You have started with the following Java code:

 

HBaseAdmin admin = new HBaseAdmin (conf);

 

HTableDescriptor t = new HTableDescriptor("weblogs");

 

Which of the following method(s) would you use next?

 
A. admin.createTable(t); admin.enableTable(t);

B. admin.createTable(t);

C. HTable.createTable(t); HTable.enableTable(t);

D. HTable.createTable(t);

 

Answer: B

Explanation: See the admin.createTable(tabledescriptor) call in the example below.

 

Creating a table in HBase

public void createTable(String tablename, String familyname) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor tabledescriptor = new HTableDescriptor(Bytes.toBytes(tablename));
    tabledescriptor.addFamily(new HColumnDescriptor(familyname));
    admin.createTable(tabledescriptor);
}

 

Reference: HBase Administration Using the Java API, Using Code Examples

http://linuxjunkies.wordpress.com/2011/12/03/hbase-administration-using-the-java-api-using-code-examples/ (creating a table in HBase, see the code)

 
QUESTION 16

You have a total of three tables stored in HBase. Excluding catalog regions, how many regions will your RegionServers have?

 

A. Exactly three

B. Exactly one

C. At least one

D. At least three

 

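Answer: D

Explanation: Every table consists of at least one region, so three tables account for at least three regions, and more once any table splits.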

 

 

QUESTION 17

For a given Column Family, you want to always retain at least one version, but expire all other versions that are older than 5 days. Which of the following Column Family attribute settings would you set to do this?

 

A. LENGTH = 5, MIN_VERSIONS = 1

B. TTL = 5, MIN_VERSIONS = 1

C. TTL = 432000, MIN_VERSIONS = 1

D. TTL = 432000, VERSIONS = 1

 

Answer: C

Explanation: * Time To Live (TTL)

ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached. This applies to all versions of a row – even the current one. The TTL time encoded in the HBase for the row is specified in UTC.

 

5 days is 432000 (5 x 24 x 60 x 60) seconds.

 

* Minimum Number of Versions

Like maximum number of row versions, the minimum number of row versions to keep is configured per column family via HColumnDescriptor. The default for min versions is 0, which means the feature is disabled. The minimum number of row versions parameter is used together with the time-to-live parameter and can be combined with the number of row versions parameter to allow configurations such as “keep the last T minutes worth of data, at most N versions, but keep at least M versions around” (where M is the value for minimum number of row versions, M<N). This parameter should only be set when time-to-live is enabled for a column family and must be less than the number of row versions.
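
The interaction between TTL and MIN_VERSIONS can be sketched as a small retention rule, assuming that a version survives if it is younger than the TTL or if it is among the newest MIN_VERSIONS versions. This is a simplified model for illustration, not HBase's compaction code; the class and method names are hypothetical, and the maximum versions setting is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

class VersionRetentionSketch {
    // Five days expressed in seconds, the unit the TTL attribute uses.
    static final long TTL_SECONDS = TimeUnit.DAYS.toSeconds(5);

    // agesNewestFirst holds the age in seconds of each stored version, newest first.
    // A version is kept if it has not expired or if it is protected by minVersions.
    static List<Long> retained(List<Long> agesNewestFirst, long ttlSeconds, int minVersions) {
        List<Long> kept = new ArrayList<>();
        for (int i = 0; i < agesNewestFirst.size(); i++) {
            long age = agesNewestFirst.get(i);
            boolean notExpired = age < ttlSeconds;
            boolean protectedByMinVersions = i < minVersions;
            if (notExpired || protectedByMinVersions) {
                kept.add(age);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical versions that are 6, 7, and 8 days old: all past the TTL.
        List<Long> ages = List.of(518400L, 604800L, 691200L);
        System.out.println(TTL_SECONDS);
        System.out.println(retained(ages, TTL_SECONDS, 1));
    }
}
```

With MIN_VERSIONS = 1 the newest version survives even though every version is older than five days; with MIN_VERSIONS = 0 the same input would leave nothing behind.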

 

Reference: HBase and Schema Design

 

 

QUESTION 18

You want to do mostly full table scans on your data. In order to improve performance you increase your block size. Why does this improve your scan performance?

 

A. It does not. Increasing block size does not improve scan performance.

B. It does not. Increasing block size means that fewer blocks fit into your block cache. This requires HBase to read each block from disk rather than cache for each scan, thereby decreasing scan performance.

C. Increasing block size requires HBase to read from disk fewer times, thereby increasing scan performance.

D. Increasing block size means fewer block indexes that need to be read from disk, thereby increasing scan performance.

 

Answer: D

Explanation: Change HFile block size to something bigger to improve scan (at cost of random read).

 

Reference: Testing HBase Scan performance

 

 

QUESTION 19

Under default settings, which feature of HBase ensures that data won’t be lost in the event of a RegionServer failure?

 

A. All HBase activity is written to the WAL, which is stored in HDFS.

B. All operations are logged on the HMaster.

C. HBase is ACID compliant, which guarantees that it is Durable.

D. Data is stored on the local filesystem of the RegionServer.

 
Answer: A

Explanation: HBase data updates are stored in a place in memory called memstore for fast write. In the event of a region server failure, the contents of the memstore are lost because they have not been saved to disk yet. To prevent data loss in such a scenario, the updates are persisted in a WAL file before they are stored in the memstore. In the event of a region server failure, the lost contents in the memstore can be regenerated by replaying the updates (also called edits) from the WAL file.
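
The write path and recovery described above can be sketched as a toy model, assuming an in-memory list stands in for the WAL file on HDFS and a sorted map stands in for the memstore. The class and method names are hypothetical and do not correspond to HBase classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WalReplaySketch {
    private final List<String[]> wal;                            // stand-in for the durable log
    private final Map<String, String> memstore = new TreeMap<>(); // volatile, lost on a crash

    // Starting a server on an existing log replays its edits into a fresh memstore.
    WalReplaySketch(List<String[]> durableWal) {
        this.wal = durableWal;
        for (String[] edit : durableWal) {
            memstore.put(edit[0], edit[1]);
        }
    }

    // Every update is appended to the log before it is applied in memory.
    void put(String row, String value) {
        wal.add(new String[] {row, value});
        memstore.put(row, value);
    }

    String get(String row) {
        return memstore.get(row);
    }

    public static void main(String[] args) {
        List<String[]> logOnHdfs = new ArrayList<>();
        WalReplaySketch server = new WalReplaySketch(logOnHdfs);
        server.put("user_1234", "alice");
        // Simulated crash: the first server and its memstore are gone, the log survives.
        WalReplaySketch replacement = new WalReplaySketch(logOnHdfs);
        System.out.println(replacement.get("user_1234"));
    }
}
```

Because the log is written first, any edit that reached the memstore is also in the log, so a replacement server can rebuild the lost state by replaying it in order.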

 

Reference: HBase Log Splitting

 

http://tm.durusau.net/?p=27674 (see the second paragraph of "From the post")

 

 

QUESTION 20

Given that the following is your entire dataset:

 

clip_image002[2]

 

How many regions will be read during a scan of the entire dataset?

 

A. Four

B. Two

C. One

D. Three

 

Answer: A
