Day 7

Tutorial 6

bdat-tutorial-6-map-reduce-sample-response

HBase

HBase is a distributed, column-oriented data store built on top of HDFS. It is part of the Hadoop ecosystem and provides random, real-time read/write access to data stored in the Hadoop Distributed File System.

HDFS (Write Once Read Many)	HBase
Not good for record lookup, only file lookup	Fast record lookup
Not good for incremental addition of small batches	Support for record-level insertion
Not good for updates	Support for updates
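
To make the contrast in the table concrete, here is a minimal conceptual sketch in Python. It does not use any HDFS or HBase API: `scan_file_for_record` and `SortedKeyStore` are hypothetical names invented for illustration, and a list of tab-separated lines stands in for a file. The point is only that a flat file must be read line by line to find one record, while a store keyed by row can fetch, insert, or update a single record directly.

```python
import bisect


def scan_file_for_record(lines, key):
    """File-style lookup: read lines in order until the key is found.

    Returns (value, lines_read). Every lookup may touch the whole file.
    """
    for lines_read, line in enumerate(lines, start=1):
        k, _, value = line.partition("\t")
        if k == key:
            return value, lines_read
    return None, len(lines)


class SortedKeyStore:
    """Record-style store (hypothetical): keys are kept sorted, as row keys are in HBase."""

    def __init__(self):
        self._keys = []    # sorted list of row keys
        self._values = {}  # row key -> latest value

    def put(self, key, value):
        # Record-level insert or update: only this one record is touched.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        # Direct lookup by key, no scan over the other records.
        return self._values.get(key)

    def scan(self, start, stop):
        # Range scan over sorted keys: start inclusive, stop exclusive.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]
```

In this sketch, finding the last record of a three-line "file" reads all three lines, whereas `SortedKeyStore.get` goes straight to the record and `put` on an existing key acts as an update.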

HBase Architecture

1. An HBase table is organized into rows, column families, columns (addressed as column family:column), and cells.

Source: https://www.edureka.co/blog/hbase-architecture/

2. When an update happens, the cell is not overwritten in place: a new version of the cell is written with a new version number (by default, a timestamp).

Source: https://community.cloudera.com/t5/Community-Articles/Hbase-security-model-part1/ta-p/248482

3. When a table grows too large, it is split into regions. A region is a partition of the table: a contiguous range of rows. Within each region, every column family is kept in its own store.

Source: https://data-flair.training/blogs/hbase-architecture/

4. The data of each region is later persisted as HFiles, which are stored on HDFS data nodes.

Source: http://bigdatariding.blogspot.com/2013/12/hbase-architecture.html
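
The points above can be sketched as a small in-memory toy model in Python. This is not the HBase client API: `ToyTable` and all of its methods are hypothetical names made up for illustration. Two simplifications are assumed: a counter stands in for the timestamp HBase uses as the default version number, and a region splits when it holds too many rows, whereas real HBase decides based on the size of the stored data. Regions are modelled as contiguous ranges of sorted row keys, and persisting data to HFiles is left out entirely.

```python
import bisect
import itertools


class ToyTable:
    """Toy in-memory model of an HBase-style table. Not the HBase client API."""

    def __init__(self, max_rows_per_region=2):
        # rows[row_key]["family:column"] -> [(version, value), ...], newest first
        self.rows = {}
        # Start key of every region, kept sorted; "" starts the first region.
        self.region_starts = [""]
        # Assumption: split on row count. Real HBase decides by stored data size.
        self.max_rows_per_region = max_rows_per_region
        # Assumption: a counter replaces the default timestamp version.
        self._next_version = itertools.count(1)

    def put(self, row, column, value):
        """Write a new version of a cell; older versions are kept."""
        version = next(self._next_version)
        cell = self.rows.setdefault(row, {}).setdefault(column, [])
        cell.insert(0, (version, value))
        self._split_if_needed(row)
        return version

    def get(self, row, column, versions=1):
        """Return up to `versions` newest (version, value) pairs of a cell."""
        return self.rows.get(row, {}).get(column, [])[:versions]

    def region_of(self, row):
        """Index of the region whose key range contains `row`."""
        return bisect.bisect_right(self.region_starts, row) - 1

    def rows_in_region(self, index):
        """Sorted row keys that fall inside region `index`."""
        start = self.region_starts[index]
        is_last = index + 1 == len(self.region_starts)
        end = None if is_last else self.region_starts[index + 1]
        return sorted(r for r in self.rows if r >= start and (end is None or r < end))

    def _split_if_needed(self, row):
        # Split the region at its middle row key once it holds too many rows.
        keys = self.rows_in_region(self.region_of(row))
        if len(keys) > self.max_rows_per_region:
            bisect.insort(self.region_starts, keys[len(keys) // 2])


table = ToyTable(max_rows_per_region=2)
table.put("row1", "info:name", "Ann")
table.put("row1", "info:name", "Anna")  # update: adds version 2, keeps version 1
print(table.get("row1", "info:name", versions=2))  # [(2, 'Anna'), (1, 'Ann')]
table.put("row2", "info:name", "Bob")
table.put("row3", "info:name", "Cy")    # third row triggers a split
print(table.region_starts)              # ['', 'row2']
```

After the split, `row1` belongs to the first region and `row2` and `row3` to the second, which mirrors how a growing table is divided by row-key range.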

HBase Storage Mechanism
