  • Post category:HBase
  • Post last modified:March 27, 2024

To insert data into an HBase table, use the PUT command. It plays the same role as an INSERT statement in an RDBMS, but the syntax is completely different. In this article, I describe how to insert data into an HBase table, with examples that use the PUT command from the HBase shell.

HBase PUT to Insert Data into Table

Use the PUT command to insert data into the rows and columns of an HBase table. Each call writes one value to one column of one row.

HBase PUT Command Syntax

Below is the syntax of the PUT command, which is used to insert data (rows and columns) into an HBase table. In the HBase shell, the command is typed in lowercase as put.


put '<name_space:table_name>', '<row_key>', '<cf:column_name>', '<value>'
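
When many values need to be inserted, the put lines can be generated by a script instead of being typed by hand. The following Python sketch is one way to do that, not part of HBase itself; quote and format_put are hypothetical helper names, and the escaping assumes the shell's single-quoted strings treat only the backslash and the single quote as special.

```python
def quote(text: str) -> str:
    """Wrap text in single quotes, escaping backslashes and single quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def format_put(table: str, row_key: str, column: str, value: str) -> str:
    """Build one put line in the form: put 'table', 'row', 'cf:column', 'value'."""
    parts = (table, row_key, column, value)
    return "put " + ", ".join(quote(part) for part in parts)


print(format_put("emp", "1", "office:name", "Scott"))
# put 'emp', '1', 'office:name', 'Scott'
```

The generated lines follow the same comma-separated pattern as the syntax above.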

HBase PUT Examples

Below are some examples of inserting data into the HBase table emp.


hbase(main):060:0> put 'emp', '1' , 'office:name', 'Scott'
hbase(main):060:0> put 'emp', '2' , 'office:name', 'Mark'     
hbase(main):061:0> put 'emp', '2' , 'office:gender', 'M'     
hbase(main):062:0> put 'emp', '2' , 'office:age', '30'
hbase(main):063:0> put 'emp', '2' , 'office:age', '50'

In the examples above, notice that we have added 2 rows: row key ‘1’ with one column, ‘office:name’, and row key ‘2’ with three columns, ‘office:name’, ‘office:gender’ and ‘office:age’. If you are coming from the RDBMS world, this may be confusing at first, because the rows of one table do not all carry the same columns. Once you understand how a column-oriented database works, it is not difficult to get used to.
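
One way to picture this layout is a map from row key to a map of columns, where each row stores only the columns that were actually written. The Python sketch below is a toy in-memory model of that idea, not HBase's storage format; the table dictionary and the put function are illustrative names.

```python
from collections import defaultdict

# row key -> {"family:qualifier": value}; a row holds only the columns written to it
table = defaultdict(dict)


def put(row_key: str, column: str, value: str) -> None:
    """Store a value under a row key and a 'family:qualifier' column name."""
    table[row_key][column] = value


put("1", "office:name", "Scott")
put("2", "office:name", "Mark")
put("2", "office:gender", "M")
put("2", "office:age", "30")

for row_key, columns in sorted(table.items()):
    print(row_key, sorted(columns))
# 1 ['office:name']
# 2 ['office:age', 'office:gender', 'office:name']
```

Row ‘1’ ends up with a single column and row ‘2’ with three, and nothing is stored for the columns that row ‘1’ lacks.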

Also note that the last command in the example above writes the column ‘office:age’ at row key ‘2’ a second time, now with the value ‘50’.

Internally, HBase does not update the existing value in place. It writes a new version of the cell with a newer timestamp, and scan returns the latest version of each column.
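
The versioning behavior described above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: each cell keeps a list of (timestamp, value) pairs, a counter stands in for the server clock, and a read returns the value with the highest timestamp. How many older versions a real table retains depends on the column family's configuration, which this sketch does not model.

```python
import itertools

clock = itertools.count(1)   # stand-in for the server-assigned timestamp
cells = {}                   # (row key, column) -> list of (timestamp, value)


def put(row_key, column, value):
    """Append a new version instead of overwriting the old one."""
    cells.setdefault((row_key, column), []).append((next(clock), value))


def get_latest(row_key, column):
    """Return the value carrying the highest timestamp."""
    return max(cells[(row_key, column)])[1]


put("2", "office:age", "30")
put("2", "office:age", "50")

print(get_latest("2", "office:age"))   # 50
print(cells[("2", "office:age")])      # [(1, '30'), (2, '50')]
```

Both values are still present in the model, but a reader asking for the current value sees only ‘50’.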


hbase(main):017:0> put 'emp', '3', 'office:salary', '10000'
Took 0.0359 seconds
hbase(main):018:0> put 'emp', '3', 'office:name', 'Jeff'
Took 0.0021 seconds
hbase(main):019:0> put 'emp', '3', 'office:salary', '20000'
Took 0.0032 seconds
hbase(main):020:0> put 'emp', '3', 'office:salary', '30000'
Took 0.0021 seconds
hbase(main):021:0> put 'emp', '3', 'office:salary', '40000'
Took 0.0025 seconds
hbase(main):027:0> put 'emp','1','office:age','20'
hbase(main):027:0> put 'emp','3','office:age','30'

Use the scan command to fetch the data from a table, for example scan 'emp'.

[Image: hbase put insert data]

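The put command also accepts optional arguments after the value: an explicit timestamp (ts1 below stands for a numeric timestamp), custom ATTRIBUTES, and a VISIBILITY expression.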


hbase> put 't1', 'r1', 'c1', 'value', ts1
hbase> put 't1', 'r1', 'c1', 'value', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
hbase> put 't1', 'r1', 'c1', 'value', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
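
The ts1 argument above sets the cell's timestamp explicitly. Because reads return the version with the highest timestamp, a put with an older explicit timestamp does not change what a reader sees as the current value. The Python sketch below is a toy model of that ordering rule only (it ignores attributes and visibility labels); the timestamps are made up and the put helper is an illustrative name.

```python
import time

versions = []  # (timestamp in ms, value) pairs for a single cell


def put(value, ts=None):
    """Record a version; use the current time in ms when no timestamp is given."""
    if ts is None:
        ts = int(time.time() * 1000)
    versions.append((ts, value))


def latest():
    """Return the value with the highest timestamp, regardless of write order."""
    return max(versions, key=lambda version: version[0])[1]


put("new", ts=2000)   # written first, with the newer timestamp
put("old", ts=1000)   # written second, with an older explicit timestamp

print(latest())       # new
```

In this model the write order does not matter; only the timestamps decide which value counts as the latest.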

Besides these, the HBase put command has several other options. Running help 'put' in the HBase shell lists them; I will leave these for you to explore.

Naveen Nelamali
