HBase Shell Commands Cheat Sheet
Post last modified: March 27, 2024

HBase shell commands are organized into 13 groups for interacting with an HBase database through the HBase shell. This article covers the usage, syntax, description, and examples of each. The first table is a cheat sheet listing every group and its commands; the remaining tables describe each group and its commands in detail.

HBase Shell Commands by Group

In the table below, click a link to see the usage, description, and examples for each HBase shell group or command. You can also get the usage of any command or group by running help '<command>' or help '<group-name>', or by entering a command name without parameters in the HBase shell.

If you do not have HBase set up and running on your system, I recommend setting it up first so you can try these commands in the HBase shell.

While trying these commands, make sure table names, rows, and columns are all enclosed in quote characters.

COMMANDS | USAGE & EXAMPLES
list_quotas | List the quota settings added to the system. You can filter the result based on USER, TABLE, or NAMESPACE.

For example:

hbase> list_quotas
hbase> list_quotas USER => 'bob.*'
hbase> list_quotas USER => 'bob.*', TABLE => 't1'
hbase> list_quotas USER => 'bob.*', NAMESPACE => 'ns.*'
hbase> list_quotas TABLE => 'myTable'
hbase> list_quotas NAMESPACE => 'ns.*'
set_quota | Set a quota for a user, table, or namespace.
Syntax : set_quota TYPE => <type>, <args>

TYPE => THROTTLE
You can set a quota on read requests, write requests, or both together (read+write, the default throttle type).
A request limit is expressed in the form 100req/sec or 100req/min. A size limit is expressed in the form
100k/sec or 100M/min, with (B, K, M, G, T, P) as valid size units
and (sec, min, hour, day) as valid time units.
Currently the throttle limit is per machine: a limit of 100req/min
means that each machine can execute 100req/min.

For example:

hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10req/sec'
hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => READ, USER => 'u1', LIMIT => '10req/sec'

hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10M/sec'
hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => WRITE, USER => 'u1', LIMIT => '10M/sec'

hbase> set_quota TYPE => THROTTLE, USER => 'u1', TABLE => 't2', LIMIT => '5K/min'
hbase> set_quota TYPE => THROTTLE, USER => 'u1', NAMESPACE => 'ns2', LIMIT => NONE

hbase> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '10req/sec'
hbase> set_quota TYPE => THROTTLE, TABLE => 't1', LIMIT => '10M/sec'
hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => WRITE, TABLE => 't1', LIMIT => '10M/sec'
hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => NONE
hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => WRITE, USER => 'u1', LIMIT => NONE

hbase> set_quota USER => 'u1', GLOBAL_BYPASS => true
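
The LIMIT strings accepted by set_quota follow a small grammar: a number, then either req or a size suffix, then a time unit. As a rough illustration only, the Python sketch below checks a limit string before you paste it into the shell. The function name parse_throttle_limit is hypothetical, and the 1024-based byte multipliers are an assumption rather than something stated in the help text, so verify them against your HBase version.

```python
import re

# Byte multipliers for the size suffixes named in the set_quota help text.
# Assumption: binary (1024-based) multiples; confirm for your HBase version.
SIZE_MULTIPLIERS = {
    "B": 1,
    "K": 1024,
    "M": 1024 ** 2,
    "G": 1024 ** 3,
    "T": 1024 ** 4,
    "P": 1024 ** 5,
}

# <number><req or size suffix>/<time unit>, for example 10req/sec or 10M/sec.
LIMIT_PATTERN = re.compile(r"^(\d+)(req|[BKMGTP])/(sec|min|hour|day)$", re.IGNORECASE)


def parse_throttle_limit(limit):
    """Split a throttle LIMIT string into its kind, amount, and time unit."""
    match = LIMIT_PATTERN.match(limit.strip())
    if match is None:
        raise ValueError(f"not a valid throttle limit: {limit!r}")
    number, unit, period = match.groups()
    if unit.lower() == "req":
        # Request-count limit such as 10req/sec.
        return {"kind": "requests", "amount": int(number), "per": period.lower()}
    # Size limit such as 10M/sec, converted to bytes.
    amount = int(number) * SIZE_MULTIPLIERS[unit.upper()]
    return {"kind": "bytes", "amount": amount, "per": period.lower()}


print(parse_throttle_limit("10req/sec"))
print(parse_throttle_limit("5K/min"))
```

The sketch only validates the string form of a limit; LIMIT => NONE, which removes a throttle, is a shell constant and is not handled here.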

HBase General Shell Commands

These shell commands are commonly used to check the version and status of the database.

COMMAND | USAGE & EXAMPLES
alter_namespace | Alter namespace properties.
To add/modify a property:
hbase> alter_namespace 'ns1', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
To delete a property:
hbase> alter_namespace 'ns1', {METHOD => 'unset', NAME => 'PROPERTY_NAME'}
create_namespace | Create namespace; pass namespace name,
and optionally a dictionary of namespace configuration.
Examples:
hbase> create_namespace 'ns1'
hbase> create_namespace 'ns1', {'PROPERTY_NAME' => 'PROPERTY_VALUE'}
describe_namespace | Describe the named namespace. For example:
hbase> describe_namespace 'ns1'
drop_namespace | Drop the named namespace. The namespace must be empty.
list_namespace | List all namespaces in hbase. Optional regular expression parameter could
be used to filter the output. Examples:
hbase> list_namespace
hbase> list_namespace 'abc.*'
list_namespace_tables | List all tables that are members of the namespace.
Examples:
hbase> list_namespace_tables 'ns1'

Data Manipulation Language (DML) Shell Commands

DML HBase shell commands include the most commonly used commands for modifying data. For example, put inserts rows into tables, get and scan retrieve data, delete and truncate remove data, append appends to cells, and there are many more commands.

GROUP NAME | HBASE SHELL COMMANDS
general | status, table_help, version, whoami
ddl | alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, locate_region, show_filters
namespace | alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
dml | append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve
tools | assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_mob, compact_rs, flush, major_compact, major_compact_mob, merge_region, move, normalizer_enabled, normalize, normalizer_switch, split, trace, unassign, wal_roll, zk_dump
replication | add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs, update_peer_config
snapshots | clone_snapshot, delete_all_snapshot, delete_snapshot, list_snapshots, restore_snapshot, snapshot
configuration | update_all_config, update_config
quotas | list_quotas, set_quota
security | grant, list_security_capabilities, revoke, user_permission
procedures | abort_procedure, list_procedures
visibility labels | add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility
rsgroup | add_rsgroup, balance_rsgroup, get_rsgroup, get_server_rsgroup, get_table_rsgroup, list_rsgroups, move_servers_rsgroup, move_tables_rsgroup, remove_rsgroup

Data Definition Language (DDL) Shell Commands

DDL HBase shell commands are another set of commands, used mostly to change the structure of a table. For example, alter is used to delete a column family from a table or to make any other alteration to the table; before you run alter, make sure you disable the table first (unless hbase.online.schema.update.enable is set to true). create is used to create a table, drop is used to drop a table, and there are many more.

Namespace Commands Syntax and Usage

This group contains commands to alter & create the namespace of the HBase database.

NAME | PROCEDURE HBASE SHELL COMMANDS USAGE
abort_procedure | Given a procedure Id (and optional boolean may_interrupt_if_running parameter,
default is true), abort a procedure in hbase. Use with caution. Some procedures
might not be abortable. For experts only.

If this command is accepted and the procedure is in the process of aborting,
it will return true; if the procedure could not be aborted (e.g. procedure
does not exist, or procedure already completed or abort will cause corruption),
this command will return false.

Examples:

hbase> abort_procedure proc_id
hbase> abort_procedure proc_id, true
hbase> abort_procedure proc_id, false
list_procedures | List all procedures in hbase. For example:

hbase> list_procedures

Tool Syntax and Usage

Replication Syntax and Usage

Note: To use these tools, hbase.replication must be set to true. Commands in this group are mainly used to add or remove a peer from an HBase cluster.

COMMANDS | USAGE & EXAMPLES
clone_snapshot | Create a new table by cloning the snapshot content.
No data is copied,
and writing to the newly created table does not affect the snapshot data.

Examples:
hbase> clone_snapshot 'snapshotName', 'tableName'
hbase> clone_snapshot 'snapshotName', 'namespace:tableName'
delete_all_snapshot | Delete all of the snapshots matching the given regex. Examples:

hbase> delete_all_snapshot 's.*'
delete_snapshot | Delete a specified snapshot. Examples:

hbase> delete_snapshot 'snapshotName'
list_snapshots | List all snapshots taken (by printing the names and relative information).
Optional regular expression parameter could be used to filter the output
by snapshot name.

Examples:
hbase> list_snapshots
hbase> list_snapshots 'abc.*'
restore_snapshot | Restore a specified snapshot.
The restore will replace the content of the original table,
bringing back the content to the snapshot state.
The table must be disabled.

Examples:
hbase> restore_snapshot 'snapshotName'
snapshot | Take a snapshot of specified table. Examples:

hbase> snapshot 'sourceTable', 'snapshotName'
hbase> snapshot 'namespace:sourceTable', 'snapshotName', {SKIP_FLUSH => true}

Snapshot Shell commands

This group of commands is used to take the snapshot of the database at any given time.

COMMAND | USAGE & EXAMPLES
alter | If the "hbase.online.schema.update.enable" property is set to
false, then the table must be disabled (see help 'disable'). If the
"hbase.online.schema.update.enable" property is set to true, tables can be
altered without disabling them first. Altering enabled tables has caused problems
in the past, so use caution and test it before using in production.

You can use the alter command to add,
modify or delete column families or change table configuration options.
Column families work in a similar way as the 'create' command. The column family
specification can either be a name string, or a dictionary with the NAME attribute.
Dictionaries are described in the output of the 'help' command, with no arguments.

For example, to change or add the 'f1' column family in table 't1' from
current value to keep a maximum of 5 cell VERSIONS, do:

hbase> alter 't1', NAME => 'f1', VERSIONS => 5

You can operate on several column families:

hbase> alter 't1', 'f1', {NAME => 'f2', IN_MEMORY => true}, {NAME => 'f3', VERSIONS => 5}

To delete the 'f1' column family in table 'ns1:t1', use one of:

hbase> alter 'ns1:t1', NAME => 'f1', METHOD => 'delete'
hbase> alter 'ns1:t1', 'delete' => 'f1'

You can also change table-scope attributes like MAX_FILESIZE, READONLY,
MEMSTORE_FLUSHSIZE, DURABILITY, etc. These can be put at the end;
for example, to change the max size of a region to 128MB, do:

hbase> alter 't1', MAX_FILESIZE => '134217728'

You can add a table coprocessor by setting a table coprocessor attribute:

hbase> alter 't1',
  'coprocessor' => 'hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2'

Since you can have multiple coprocessors configured for a table, a
sequence number will be automatically appended to the attribute name
to uniquely identify it.

The coprocessor attribute must match the pattern below in order for
the framework to understand how to load the coprocessor classes:

[coprocessor jar file location] | class name | [priority] | [arguments]

You can also set configuration settings specific to this table or column family:

hbase> alter 't1', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}
hbase> alter 't1', {NAME => 'f2', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

You can also remove a table-scope attribute:

hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE'

hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'coprocessor$1'

You can also set REGION_REPLICATION:

hbase> alter 't1', {REGION_REPLICATION => 2}

There could be more than one alteration in one command:

hbase> alter 't1', { NAME => 'f1', VERSIONS => 3 },
  { MAX_FILESIZE => '134217728' }, { METHOD => 'delete', NAME => 'f2' },
  OWNER => 'johndoe', METADATA => { 'mykey' => 'myvalue' }
alter_async | Alter column family schema without waiting for all regions to receive the
schema changes. Pass table name and a dictionary specifying new column
family schema. Dictionaries are described on the main help command output.
Dictionary must include name of column family to alter. For example,

To change or add the 'f1' column family in table 't1' from defaults
to instead keep a maximum of 5 cell VERSIONS, do:

hbase> alter_async 't1', NAME => 'f1', VERSIONS => 5

To delete the 'f1' column family in table 'ns1:t1', do:

hbase> alter_async 'ns1:t1', NAME => 'f1', METHOD => 'delete'

or a shorter version:

hbase> alter_async 'ns1:t1', 'delete' => 'f1'

You can also change table-scope attributes like MAX_FILESIZE,
MEMSTORE_FLUSHSIZE, READONLY, and DEFERRED_LOG_FLUSH.

For example, to change the max size of a family to 128MB, do:

hbase> alter 't1', METHOD => 'table_att', MAX_FILESIZE => '134217728'

There could be more than one alteration in one command:

hbase> alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}

To check if all the regions have been updated, use alter_status.
alter_status | Get the status of the alter command. Indicates the number of regions of the
table that have received the updated schema.
Pass table name.

hbase> alter_status 't1'
hbase> alter_status 'ns1:t1'
create | Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily
including NAME attribute.
Examples:

Create a table with namespace=ns1 and table qualifier=t1
hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
hbase> # The above in shorthand would be the following:
hbase> create 't1', 'f1', 'f2', 'f3'
hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 1000000, MOB_COMPACT_PARTITION_POLICY => 'weekly'}

Table configuration options can be put at the end.
Examples:

hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
hbase> # Optionally pre-split the table into NUMREGIONS, using
hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
hbase> create 't1', {NAME => 'f1', DFS_REPLICATION => 1}

You can also keep around a reference to the created table:

hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then
call methods.
describe | Describe the named table. For example:
hbase> describe 't1'
hbase> describe 'ns1:t1'

Alternatively, you can use the abbreviated 'desc' for the same thing.
hbase> desc 't1'
hbase> desc 'ns1:t1'
disable | Start disable of named table:
hbase> disable 't1'
hbase> disable 'ns1:t1'
disable_all | Disable all of the tables matching the given regex:

hbase> disable_all 't.*'
hbase> disable_all 'ns:t.*'
hbase> disable_all 'ns:.*'
drop | Drop the named table. Table must first be disabled:

hbase> drop 't1'
hbase> drop 'ns1:t1'
drop_all | Drop all of the tables matching the given regex:

hbase> drop_all 't.*'
hbase> drop_all 'ns:t.*'
hbase> drop_all 'ns:.*'
enable | Start enable of named table:
hbase> enable 't1'
hbase> enable 'ns1:t1'
enable_all | Enable all of the tables matching the given regex:

hbase> enable_all 't.*'
hbase> enable_all 'ns:t.*'
hbase> enable_all 'ns:.*'
exists | Does the named table exist?

hbase> exists 't1'
hbase> exists 'ns1:t1'
get_table | Get the given table name and return it as an actual object to
be manipulated by the user. See table.help for more information
on how to use the table.
E.g.

hbase> t1 = get_table 't1'
hbase> t1 = get_table 'ns1:t1'

returns the table named 't1' as a table object. You can then do

hbase> t1.help

which will then print the help for that table.
is_disabled | Is named table disabled? For example:
hbase> is_disabled 't1'
hbase> is_disabled 'ns1:t1'
is_enabled | Is named table enabled? For example:
hbase> is_enabled 't1'
hbase> is_enabled 'ns1:t1'
list | List all tables.
hbase> list
TABLE
table1
table2
locate_region | Locate the region given a table name and a row-key

hbase> locate_region 'tableName', 'key0'
show_filters | List the available filters. For example:
hbase> show_filters
DependentColumnFilter
KeyOnlyFilter
ColumnCountGetFilter
SingleColumnValueFilter
PrefixFilter
SingleColumnValueExcludeFilter
FirstKeyOnlyFilter
ColumnRangeFilter
TimestampsFilter
FamilyFilter
QualifierFilter
ColumnPrefixFilter
RowFilter
MultipleColumnPrefixFilter
InclusiveStopFilter
PageFilter
ValueFilter
ColumnPaginationFilter
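
The alter entry above gives the coprocessor attribute as four pipe-separated fields: [coprocessor jar file location] | class name | [priority] | [arguments]. The Python sketch below shows one way to assemble that string before passing it to alter. The helper name build_coprocessor_spec and its parameters are made up for illustration, and the sketch only reproduces the layout from the example; it does not check that HBase will accept the result.

```python
def build_coprocessor_spec(class_name, jar_path="", priority=None, arguments=None):
    """Assemble 'jar|class|priority|args' for a table coprocessor attribute.

    Only class_name is required; the bracketed fields in the pattern
    (jar location, priority, arguments) are left empty when not given.
    """
    if not class_name:
        raise ValueError("class name is required")
    priority_part = "" if priority is None else str(priority)
    # Arguments are rendered as comma-separated key=value pairs.
    args_part = ",".join(f"{key}={value}" for key, value in (arguments or {}).items())
    return "|".join([jar_path, class_name, priority_part, args_part])


spec = build_coprocessor_spec(
    "com.foo.FooRegionObserver",
    jar_path="hdfs:///foo.jar",
    priority=1001,
    arguments={"arg1": 1, "arg2": 2},
)
# Print the shell command with the assembled attribute value.
print(f"alter 't1', 'coprocessor' => '{spec}'")
```

With the values from the alter example, the assembled string matches the one shown there: hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2.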

Configuration Syntax and Usage

COMMANDS | USAGE & EXAMPLES
add_peer | A peer can either be another HBase cluster or a custom replication endpoint. In either case an id
must be specified to identify the peer.

For an HBase cluster peer, a cluster key must be provided and is composed like this:
hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent
This gives a full path for HBase to connect to another HBase cluster. An optional parameter for
table column families identifies which column families will be replicated to the peer cluster.
Examples:

hbase> add_peer '1', "server1.cie.com:2181:/hbase"
hbase> add_peer '2', "zk1,zk2,zk3:2182:/hbase-prod"
hbase> add_peer '3', "zk4,zk5,zk6:11000:/hbase-test", "table1; table2:cf1; table3:cf1,cf2"
hbase> add_peer '4', CLUSTER_KEY => "server1.cie.com:2181:/hbase"
hbase> add_peer '5', CLUSTER_KEY => "zk1,zk2,zk3:2182:/hbase-prod",
  TABLE_CFS => { "table1" => [], "ns2:table2" => ["cf1"], "ns3:table3" => ["cf1", "cf2"] }

For a custom replication endpoint, the ENDPOINT_CLASSNAME can be provided. Two optional arguments
are DATA and CONFIG, which can be specified to set either the peer_data or the configuration
for the custom replication endpoint. Table column families are optional and can be specified with
the key TABLE_CFS.

hbase> add_peer '6', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint'
hbase> add_peer '7', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
  DATA => { "key1" => 1 }
hbase> add_peer '8', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
  CONFIG => { "config1" => "value1", "config2" => "value2" }
hbase> add_peer '9', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
  DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" }
hbase> add_peer '10', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
  TABLE_CFS => { "table1" => [], "ns2:table2" => ["cf1"], "ns3:table3" => ["cf1", "cf2"] }
hbase> add_peer '11', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
  DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" },
  TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] }

Note: Either CLUSTER_KEY or ENDPOINT_CLASSNAME must be specified but not both.
append_peer_tableCFs | Append a replicable table-cf config for the specified peer.
Examples:

# append a table / table-cf to be replicable for a peer
hbase> append_peer_tableCFs '2', { "ns1:table4" => ["cfA", "cfB"] }
disable_peer | Stops the replication stream to the specified cluster, but still
keeps track of new edits to replicate.

Examples:

hbase> disable_peer '1'
disable_table_replication | Disable a table's replication switch.

Examples:

hbase> disable_table_replication 'table_name'
enable_peer | Restarts the replication to the specified peer cluster,
continuing from where it was disabled.

Examples:

hbase> enable_peer '1'
enable_table_replication | Enable a table's replication switch.

Examples:

hbase> enable_table_replication 'table_name'
get_peer_config | Outputs the cluster key, replication endpoint class (if present), and any replication configuration parameters.
list_peer_configs | No-argument method that outputs the replication peer configuration for each peer defined on this cluster.
list_peers | List all replication peer clusters.
hbase> list_peers
list_replicated_tables | List all the tables and column families replicated from this cluster.

hbase> list_replicated_tables
hbase> list_replicated_tables 'abc.*'
remove_peer | Stops the specified replication stream and deletes all the meta
information kept about it. Examples:

hbase> remove_peer '1'
remove_peer_tableCFs | Remove a table / table-cf from the table-cfs config for the specified peer.
Examples:

# Remove a table / table-cf from the replicable table-cfs for a peer
hbase> remove_peer_tableCFs '2', { "ns1:table1" => [] }
hbase> remove_peer_tableCFs '2', { "ns1:table1" => ["cf1"] }
set_peer_tableCFs | Set the replicable table-cf config for the specified peer.
Examples:

# set all tables to be replicable for a peer
hbase> set_peer_tableCFs '1', ""
hbase> set_peer_tableCFs '1'
# set table / table-cf to be replicable for a peer, for a table without
# an explicit column-family list, all replicable column-families (with
# replication_scope == 1) will be replicated
hbase> set_peer_tableCFs '2', { "ns1:table1" => [],
  "ns2:table2" => ["cf1", "cf2"],
  "ns3:table3" => ["cfA", "cfB"] }
show_peer_tableCFs | Show replicable table-cf config for the specified peer.

hbase> show_peer_tableCFs
update_peer_config | A peer can either be another HBase cluster or a custom replication endpoint. In either case an id
must be specified to identify the peer. This command does not interrupt processing on an enabled replication peer.

Two optional arguments are DATA and CONFIG, which can be specified to set different values for either
the peer_data or configuration for a custom replication endpoint. Any existing values not updated by this command
are left unchanged.

CLUSTER_KEY, REPLICATION_ENDPOINT, and TABLE_CFs cannot be updated with this command.
To update TABLE_CFs, see the append_peer_tableCFs and remove_peer_tableCFs commands.

hbase> update_peer_config '1', DATA => { "key1" => 1 }
hbase> update_peer_config '2', CONFIG => { "config1" => "value1", "config2" => "value2" }
hbase> update_peer_config '3', DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" }
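
The add_peer entry above describes the cluster key as hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent. As a small illustration, the Python sketch below builds and splits such a key from its three parts. The function names build_cluster_key and parse_cluster_key are hypothetical helpers, not part of HBase, and the sketch assumes host names without embedded colons.

```python
def build_cluster_key(quorum_hosts, client_port, znode_parent):
    """Join ZooKeeper quorum hosts, client port, and parent znode into a cluster key."""
    if not quorum_hosts:
        raise ValueError("at least one ZooKeeper host is required")
    if not znode_parent.startswith("/"):
        raise ValueError("znode parent must be an absolute path such as /hbase")
    return f"{','.join(quorum_hosts)}:{int(client_port)}:{znode_parent}"


def parse_cluster_key(cluster_key):
    """Split a cluster key back into (hosts, port, znode parent)."""
    parts = cluster_key.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected quorum:port:znode, got {cluster_key!r}")
    hosts, port, znode_parent = parts
    return hosts.split(","), int(port), znode_parent


key = build_cluster_key(["zk1", "zk2", "zk3"], 2182, "/hbase-prod")
# Print the shell command that would register this peer.
print(f"add_peer '2', CLUSTER_KEY => \"{key}\"")
```

Building the key from named parts makes it harder to mix up the client port and the znode path when a peer is registered by hand.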

Quota Syntax and Usage

COMMANDS | SECURITY HBASE SHELL COMMANDS USAGE
grant | Grant users specific rights.
Syntax : grant <user>, <permissions> [, <@namespace> [, <table> [, <column family> [, <column qualifier>]]]]

<permissions> is zero or more letters from the set "RWXCA":
READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')

Note: Groups and users are granted access in the same way, but groups are prefixed with an '@'
character. Tables and namespaces are likewise specified in the same way, but namespaces are
prefixed with an '@' character.

For example:

hbase> grant 'bobsmith', 'RWXCA'
hbase> grant '@admins', 'RWXCA'
hbase> grant 'bobsmith', 'RWXCA', '@ns1'
hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1'
hbase> grant 'bobsmith', 'RW', 'ns1:t1', 'f1', 'col1'
list_security_capabilities | List supported security capabilities.

Example:
hbase> list_security_capabilities
revoke | Revoke a user's access rights.
Syntax : revoke <user> [, <@namespace> [, <table> [, <column family> [, <column qualifier>]]]]

Note: Group and user access is revoked in the same way, but groups are prefixed with an '@'
character. Tables and namespaces are likewise specified in the same way, but namespaces are
prefixed with an '@' character.

For example:

hbase> revoke 'bobsmith'
hbase> revoke '@admins'
hbase> revoke 'bobsmith', '@ns1'
hbase> revoke 'bobsmith', 't1', 'f1', 'col1'
hbase> revoke 'bobsmith', 'ns1:t1', 'f1', 'col1'
user_permission | Show all permissions for the particular user.
Syntax : user_permission <table>

Note: A namespace must always be preceded by the '@' character.

For example:

hbase> user_permission
hbase> user_permission '@ns1'
hbase> user_permission '@.*'
hbase> user_permission '@^[a-c].*'
hbase> user_permission 'table1'
hbase> user_permission 'namespace1:table1'
hbase> user_permission '.*'
hbase> user_permission '^[A-C].*'
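
The grant and revoke entries above use two compact conventions: a permission string made of letters from "RWXCA", and an '@' prefix that marks groups and namespaces. The Python sketch below spells those conventions out. All function names are hypothetical, and accepting only uppercase letters is an assumption made for this sketch.

```python
# Permission letters and their meanings, as listed in the grant entry.
PERMISSION_NAMES = {
    "R": "READ",
    "W": "WRITE",
    "X": "EXEC",
    "C": "CREATE",
    "A": "ADMIN",
}


def expand_permissions(letters):
    """Translate a permission string such as 'RW' into action names."""
    unknown = [ch for ch in letters if ch not in PERMISSION_NAMES]
    if unknown:
        raise ValueError(f"unknown permission letter(s): {''.join(unknown)}")
    return [PERMISSION_NAMES[ch] for ch in letters]


def describe_principal(name):
    """Names starting with '@' are groups; everything else is a user."""
    return ("group", name[1:]) if name.startswith("@") else ("user", name)


def describe_scope(target):
    """Targets starting with '@' are namespaces; everything else is a table."""
    return ("namespace", target[1:]) if target.startswith("@") else ("table", target)


print(expand_permissions("RW"))
print(describe_principal("@admins"))
print(describe_scope("@ns1"))
```

Note that 'ns1:t1' is classified as a table here, since the namespace-qualified form still names a table; only a leading '@' marks a namespace.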

HBase Security Syntax and Usage

Note: Security commands are only applicable if running with the AccessController coprocessor. These HBase shell commands are mostly used by an admin to secure the database and its tables.

Procedure Syntax and Usage

COMMANDS | USAGE & EXAMPLES
add_rsgroup | Create a new RegionServer group.

Example:

hbase> add_rsgroup 'my_group'
balance_rsgroup | Balance a RegionServer group.

Example:

hbase> balance_rsgroup 'my_group'
get_rsgroup | Get a RegionServer group's information.

Example:

hbase> get_rsgroup 'default'
get_server_rsgroup | Get the group name the given RegionServer is a member of.

Example:

hbase> get_server_rsgroup 'server1:port1'
get_table_rsgroup | Get the RegionServer group name the given table is a member of.

Example:

hbase> get_table_rsgroup 'myTable'
list_rsgroups | List all RegionServer groups. Optional regular expression parameter can
be used to filter the output.

Example:

hbase> list_rsgroups
hbase> list_rsgroups 'abc.*'
move_servers_rsgroup | Reassign a region server from one RSGroup to another.

hbase> move_servers_rsgroup 'dest',['server1:port','server2:port']
move_tables_rsgroup | Reassign tables from one RSGroup to another.

hbase> move_tables_rsgroup 'dest',['table1','table2']
remove_rsgroup | Remove a RegionServer group.

hbase> remove_rsgroup 'my_group'

Visibility Label Syntax and Usage

ARRAY FUNCTION SYNTAXARRAY FUNCTION DESCRIPTION
array_contains(column: Column, value: Any)Check if a value presents in an array column. Return below values.
true – Returns if value presents in an array.
false – When valu eno presents.
null – when array is null.
array_distinct(e: Column)Return distinct values from the array after removing duplicates.
array_except(col1: Column, col2: Column)Returns all elements from col1 array but not in col2 array.
array_intersect(col1: Column, col2: Column)Returns all elements that are present in col1 and col2 arrays.
array_join(column: Column, delimiter: String, nullReplacement: String)
array_join(column: Column, delimiter: String)
Concatenates all elments of array column with using provided delimeter. When Null valeus are present, they replaced with ‘nullReplacement’ string
array_max(e: Column)Return maximum values in an array
array_min(e: Column)Return minimum values in an array
array_position(column: Column, value: Any)Returns a position/index of first occurrence of the ‘value’ in the given array. Returns position as long type and the position is not zero based instead starts with 1.
Returns zero when value is not found.
Returns null when any of the arguments are null.
array_remove(column: Column, element: Any)Returns an array after removing all provided ‘value’ from the given array.
array_repeat(e: Column, count: Int)Creates an array containing the first argument repeated the number of times given by the second argument.
array_repeat(left: Column, right: Column)Creates an array containing the first argument repeated the number of times given by the second argument.
array_sort(e: Column)Returns the sorted array of the given input array. All null values are placed at the end of the array.
array_union(col1: Column, col2: Column)Returns an array of elements that are present in both arrays (all elements from both arrays) with out duplicates.
arrays_overlap(a1: Column, a2: Column)true – if `a1` and `a2` have at least one non-null element in common
false – if `a1` and `a2` have completely different elements.
null – if both the arrays are non-empty and any of them contains a `null`
arrays_zip(e: Column*)
Returns a merged array of structs in which the N-th struct contains all N-th values of the input arrays.
concat(exprs: Column*)
Concatenates the elements of the given columns.
element_at(column: Column, value: Any)
Returns the element of an array located at the ‘value’ input position.
exists(column: Column, f: Column => Column)
Returns whether the predicate ‘f’ holds for at least one element in the array column.
explode(e: Column)
Creates a row for each element in the array column.
explode_outer(e: Column)
Creates a row for each element in the array column. Unlike explode, if the array is null or empty, it returns null.
filter(column: Column, f: Column => Column)
filter(column: Column, f: (Column, Column) => Column)
Returns an array of the elements for which a predicate holds in a given array.
flatten(e: Column)
Creates a single array from an array of arrays column.
forall(column: Column, f: Column => Column)
Returns whether a predicate holds for every element in the array.
posexplode(e: Column)
Creates a row for each element in the array and creates two columns: ‘pos’ to hold the position of the array element and ‘col’ to hold the actual array value.
posexplode_outer(e: Column)
Creates a row for each element in the array and creates two columns: ‘pos’ to hold the position of the array element and ‘col’ to hold the actual array value. Unlike posexplode, if the array is null or empty, it returns null, null for the pos and col columns.
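The difference between posexplode and posexplode_outer shows up only for null or empty arrays. The following plain-Python sketch models the rows each would produce for a single input array; the function names are hypothetical and this is not Spark code. Positions start at 0, which matches Spark's behavior for the ‘pos’ column.

```python
from typing import Any, List, Optional, Sequence, Tuple

Row = Tuple[Optional[int], Any]  # (pos, col)


def posexplode_model(arr: Optional[Sequence[Any]]) -> List[Row]:
    """Model of posexplode: a null or empty array produces no rows."""
    if not arr:
        return []
    # One (pos, col) row per element, with positions starting at 0.
    return list(enumerate(arr))


def posexplode_outer_model(arr: Optional[Sequence[Any]]) -> List[Row]:
    """Model of posexplode_outer: a null or empty array produces one (null, null) row."""
    rows = posexplode_model(arr)
    return rows if rows else [(None, None)]
```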
reverse(e: Column)
Returns the array of elements in reverse order.
sequence(start: Column, stop: Column)
Generates the sequence of numbers from start to stop.
sequence(start: Column, stop: Column, step: Column)
Generates the sequence of numbers from start to stop, incrementing by the given step value.
shuffle(e: Column)
Shuffles the given array.
size(e: Column)
Returns the length of an array.
slice(x: Column, start: Int, length: Int)
Returns an array of elements starting at position ‘start’ with the given length.
sort_array(e: Column)
Sorts the array in ascending order. Null values are placed at the beginning.
sort_array(e: Column, asc: Boolean)
Sorts the array in ascending or descending order based on the boolean parameter. For ascending order, null values are placed at the beginning; for descending order, they are placed at the end.
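Null placement is the main difference between sort_array and array_sort. The plain-Python sketch below models where nulls end up in each case; it is not Spark code, and both function names are hypothetical.

```python
from typing import Any, List


def sort_array_model(arr: List[Any], asc: bool = True) -> List[Any]:
    """Model of sort_array: nulls first when ascending, last when descending."""
    nulls = [x for x in arr if x is None]
    values = sorted((x for x in arr if x is not None), reverse=not asc)
    return nulls + values if asc else values + nulls


def array_sort_model(arr: List[Any]) -> List[Any]:
    """Model of array_sort: ascending order with nulls placed at the end."""
    nulls = [x for x in arr if x is None]
    values = sorted(x for x in arr if x is not None)
    return values + nulls
```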
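transform(column: Column, f: Column => Column)
transform(column: Column, f: (Column, Column) => Column)
Returns an array of elements after applying the transformation ‘f’ to each element.
zip_with(left: Column, right: Column, f: (Column, Column) => Column)
Merges two input arrays, element by element, into a single array using the function ‘f’.
aggregate(
expr: Column,
zero: Column,
merge: (Column, Column) => Column,
finish: Column => Column)
Aggregates the elements of the array into a single value: starting from ‘zero’, ‘merge’ is applied to the running state and each element, and ‘finish’ converts the final state into the result.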
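The roles of the ‘zero’, ‘merge’, and ‘finish’ parameters of aggregate can be illustrated with a plain-Python fold. This is a sketch of the idea only, not Spark code; `aggregate_model` is a hypothetical name and it works on Python lists rather than columns.

```python
from typing import Any, Callable, Iterable


def aggregate_model(
    arr: Iterable[Any],
    zero: Any,
    merge: Callable[[Any, Any], Any],
    finish: Callable[[Any], Any] = lambda state: state,
) -> Any:
    """Fold the elements into one state, then convert that state to the result."""
    state = zero
    for element in arr:
        # merge combines the running state with the next element.
        state = merge(state, element)
    # finish turns the final state into the returned value.
    return finish(state)
```

With `zero=0` and a `merge` that adds, the model sums the list; supplying a `finish` such as `lambda s: s * 10` post-processes that sum.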

Rsgroup HBase Shell Commands

Note: The rsgroup Coprocessor Endpoint must be enabled on the Master else commands fail with:
UnknownProtocolException: No registered Master Coprocessor Endpoint found for RSGroupAdminService

COMMANDS
USAGE & EXAMPLES
assign
Assign a region. Use with caution. If the region is already assigned,
this command will do a force reassign. For experts only.
Examples:

hbase> assign ‘REGIONNAME’
hbase> assign ‘ENCODED_REGIONNAME’
balance_switch
Enable/Disable balancer. Returns previous balancer state.
Examples:

hbase> balance_switch true
hbase> balance_switch false
balancer
Trigger the cluster balancer. Returns true if the balancer ran and was able to
tell the region servers to unassign all the regions to balance (the re-assignment itself is async).
Otherwise returns false (it will not run if regions are in transition).

balancer_enabled
Query the balancer’s state.
Examples:

hbase> balancer_enabled
catalogjanitor_enabled
Query for the CatalogJanitor state (enabled/disabled?)
Examples:

hbase> catalogjanitor_enabled
catalogjanitor_run
Catalog janitor command to run the (garbage collection) scan from the command line.

hbase> catalogjanitor_run
catalogjanitor_switch
Enable/Disable CatalogJanitor. Returns previous CatalogJanitor state.
Examples:

hbase> catalogjanitor_switch true
hbase> catalogjanitor_switch false
close_region
Close a single region. Ask the master to close a region out on the cluster
or, if ‘SERVER_NAME’ is supplied, ask the designated hosting regionserver to
close the region directly. When asking the master to close a region, ‘REGIONNAME’
must be a fully qualified region name. When asking the hosting regionserver to
directly close a region, you pass the region’s encoded name only. A region
name looks like this:

TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.
or
Namespace:TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.

The trailing period is part of the region name. A region’s encoded name
is the hash at the end of a region name; e.g. 527db22f95c8a9e0116f0cc13c680396
(without the period). A ‘SERVER_NAME’ is its host, port plus startcode. For
example: host187.example.com,60020,1289493121758 (find the servername in the master ui
or when you do detailed status in the shell). When ‘SERVER_NAME’ is supplied, this command
ends up running close on the regionserver hosting the region. That close is done without the
master’s involvement (it will not know of the close). Once closed, the region will
stay closed. Use assign to reopen/reassign. Use unassign or move to assign
the region elsewhere on the cluster. Use with caution. For experts only.
Examples:

hbase> close_region ‘REGIONNAME’
hbase> close_region ‘REGIONNAME’, ‘SERVER_NAME’
hbase> close_region ‘ENCODED_REGIONNAME’
hbase> close_region ‘ENCODED_REGIONNAME’, ‘SERVER_NAME’
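Because several commands expect the encoded region name rather than the full region name, a small helper can derive one from the other. The following Python function is a sketch under the naming format shown above; `encoded_region_name` is a hypothetical helper, not part of HBase.

```python
def encoded_region_name(region_name: str) -> str:
    """Return the hash suffix (encoded name) of a full region name."""
    # Drop the trailing period, if present.
    trimmed = region_name[:-1] if region_name.endswith(".") else region_name
    # The encoded name is the text after the last remaining period.
    return trimmed.rsplit(".", 1)[-1]


full_name = "TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396."
print(encoded_region_name(full_name))
```

The same helper works for namespaced region names such as `Namespace:TestTable,...`, since only the suffix after the last period is used.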
compact
Compact all regions in passed table or pass a region row
to compact an individual region. You can also compact a single column
family within a region.
Examples:
Compact all regions in a table:
hbase> compact ‘ns1:t1’
hbase> compact ‘t1’
Compact an entire region:
hbase> compact ‘r1’
Compact only a column family within a region:
hbase> compact ‘r1’, ‘c1’
Compact a column family within a table:
hbase> compact ‘t1’, ‘c1’
compact_mob
Run compaction on a mob enabled column family
or all mob enabled column families within a table
Examples:
Compact a column family within a table:
hbase> compact_mob ‘t1’, ‘c1’
Compact all mob enabled column families
hbase> compact_mob ‘t1’
compact_rs
Compact all regions on passed regionserver.
Examples:
Compact all regions on a regionserver:
hbase> compact_rs ‘host187.example.com,60020’
or
hbase> compact_rs ‘host187.example.com,60020,1289493121758’
Major compact all regions on a regionserver:
hbase> compact_rs ‘host187.example.com,60020,1289493121758’, true
flush
Flush all regions in passed table or pass a region row to
flush an individual region. For example:

hbase> flush ‘TABLENAME’
hbase> flush ‘REGIONNAME’
hbase> flush ‘ENCODED_REGIONNAME’
major_compact
Run major compaction on passed table or pass a region row
to major compact an individual region. To compact a single
column family within a region, specify the region name
followed by the column family name.
Examples:
Compact all regions in a table:
hbase> major_compact ‘t1’
hbase> major_compact ‘ns1:t1’
Compact an entire region:
hbase> major_compact ‘r1’
Compact a single column family within a region:
hbase> major_compact ‘r1’, ‘c1’
Compact a single column family within a table:
hbase> major_compact ‘t1’, ‘c1’
major_compact_mob
Run major compaction on a mob enabled column family
or all mob enabled column families within a table
Examples:
Compact a column family within a table:
hbase> major_compact_mob ‘t1’, ‘c1’
Compact all mob enabled column families within a table
hbase> major_compact_mob ‘t1’
merge_region
Merge two regions. Passing ‘true’ as the optional third parameter will force
a merge (‘force’ merges regardless; otherwise the merge fails unless the regions
passed are adjacent. ‘force’ is for expert use only).

NOTE: You must pass the encoded region name, not the full region name so
this command is a little different from other region operations. The encoded
region name is the hash suffix on region names: e.g. if the region name were
TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then
the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396

Examples:

hbase> merge_region ‘ENCODED_REGIONNAME’, ‘ENCODED_REGIONNAME’
hbase> merge_region ‘ENCODED_REGIONNAME’, ‘ENCODED_REGIONNAME’, true
move
Move a region. Optionally specify a target regionserver; otherwise one is chosen
at random. NOTE: You pass the encoded region name, not the region name so
this command is a little different to the others. The encoded region name
is the hash suffix on region names: e.g. if the region name were
TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then
the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396
A server name is its host, port plus startcode. For example:
host187.example.com,60020,1289493121758
Examples:

hbase> move ‘ENCODED_REGIONNAME’
hbase> move ‘ENCODED_REGIONNAME’, ‘SERVER_NAME’
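A server name is a comma-separated triple of host, port, and startcode. The Python sketch below splits one into its parts; `ServerName` and `parse_server_name` are hypothetical names, not an HBase API. It also accepts the two-part ‘host,port’ form that appears in the compact_rs example.

```python
from typing import NamedTuple, Optional


class ServerName(NamedTuple):
    host: str
    port: int
    startcode: Optional[int]  # None when only 'host,port' is given


def parse_server_name(server_name: str) -> ServerName:
    """Split 'host,port[,startcode]' into its components."""
    parts = server_name.split(",")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected 'host,port[,startcode]', got {server_name!r}")
    host, port = parts[0], int(parts[1])
    startcode = int(parts[2]) if len(parts) == 3 else None
    return ServerName(host, port, startcode)


print(parse_server_name("host187.example.com,60020,1289493121758"))
```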
normalize
Trigger region normalizer for all tables which have NORMALIZATION_ENABLED flag set. Returns true
if normalizer ran successfully, false otherwise. Note that this command has no effect
if region normalizer is disabled (make sure it’s turned on using ‘normalizer_switch’ command).

Examples:

hbase> normalize
normalizer_enabled
Query the state of region normalizer.
Examples:

hbase> normalizer_enabled
normalizer_switch
Enable/Disable region normalizer. Returns previous normalizer state.
When normalizer is enabled, it handles all tables with ‘NORMALIZATION_ENABLED’ => true.
Examples:

hbase> normalizer_switch true
hbase> normalizer_switch false
split
Split entire table or pass a region to split individual region. With the
second parameter, you can specify an explicit split key for the region.
Examples:
split ‘tableName’
split ‘namespace:tableName’
split ‘regionName’ # format: ‘tableName,startKey,id’
split ‘tableName’, ‘splitKey’
split ‘regionName’, ‘splitKey’
trace
Start or stop tracing using HTrace.
Always returns true if tracing is running, otherwise false.
If the first argument is ‘start’, a new span is started.
If the first argument is ‘stop’, the currently running span is stopped.
(‘stop’ returns false on success.)
If the first argument is ‘status’, it just returns whether tracing is running.
When starting, you can optionally pass the name of the span as the second argument.
The default span name is ‘HBaseShell’.
Repeating ‘start’ does not start a nested span.

Examples:

hbase> trace ‘start’
hbase> trace ‘status’
hbase> trace ‘stop’

hbase> trace ‘start’, ‘MySpanName’
hbase> trace ‘stop’
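The return values of trace can look surprising at first (‘stop’ returns false on success), so here is a small Python state model of the rules described above. It is only an illustration of the documented behavior, not HBase or HTrace code; the `TraceModel` class is hypothetical.

```python
from typing import Optional


class TraceModel:
    """Toy model: every call returns whether tracing is running afterwards."""

    def __init__(self) -> None:
        self.span_name: Optional[str] = None  # None means no span is running

    def trace(self, action: str, span_name: str = "HBaseShell") -> bool:
        if action == "start":
            # Repeating 'start' does not open a nested span.
            if self.span_name is None:
                self.span_name = span_name
        elif action == "stop":
            self.span_name = None
        elif action != "status":
            raise ValueError(f"unknown action: {action!r}")
        return self.span_name is not None
```

In this model, `trace("stop")` returns False precisely because tracing is no longer running once the span has been stopped.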
unassign
Unassign a region. Unassign will close the region in its current location and then
reopen it again. Pass ‘true’ to force the unassignment (‘force’ will clear
all in-memory state in the master before the reassign. If this results in
double assignment, use hbck -fix to resolve).
Use with caution. For expert use only.
Examples:

hbase> unassign ‘REGIONNAME’
hbase> unassign ‘REGIONNAME’, true
hbase> unassign ‘ENCODED_REGIONNAME’
hbase> unassign ‘ENCODED_REGIONNAME’, true
wal_roll
Roll the log writer. That is, start writing log messages to a new file.
The name of the regionserver should be given as the parameter. A
‘server_name’ is the host, port plus startcode of a regionserver. For
example: host187.example.com,60020,1289493121758 (find the servername in the
master ui or when you do detailed status in the shell).
zk_dump
Dump status of HBase cluster as seen by ZooKeeper.

Conclusion:

We have seen that HBase shell commands are broken down into several groups, each serving a different purpose, and we have looked at the usage, description, and examples of each command used to interact with HBase. I hope it helps!

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen’s journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them. Follow Naveen @ LinkedIn and Medium