HBase shell commands are organized into 13 groups for interacting with an HBase database through the HBase shell. This article covers the usage, syntax, description, and examples of each. Of the tables below, the first lists the groups and all their commands as a cheat sheet, and the remaining tables describe each group and its commands in detail.
HBase Shell Commands by Group
In the table below, click the links to check the usage, description, and examples for each HBase shell group or command. You can also get the usage of each by running help '<command>' or help '<group-name>', or by entering the command name without parameters in the HBase shell.
If you do not have HBase set up and running on your system, I recommend setting it up first so you can try these commands in the HBase shell.
When trying these commands, make sure table names, rows, and columns are all enclosed in quote characters.
COMMANDS | USAGE & EXAMPLES |
---|---|
list_quotas | List the quota settings added to the system. You can filter the result based on USER, TABLE, or NAMESPACE. For example: hbase> list_quotas hbase> list_quotas USER => 'bob.*' hbase> list_quotas USER => 'bob.*', TABLE => 't1' hbase> list_quotas USER => 'bob.*', NAMESPACE => 'ns.*' hbase> list_quotas TABLE => 'myTable' hbase> list_quotas NAMESPACE => 'ns.*' |
set_quota | Set a quota for a user, table, or namespace. Syntax: set_quota TYPE => <type>, <args>. For TYPE => THROTTLE, the user can set a quota on read requests, write requests, or both together (read+write, the default throttle type). A request limit can be expressed in the form 100req/sec or 100req/min, and a size limit in the form 100k/sec or 100M/min, with (B, K, M, G, T, P) as valid size units and (sec, min, hour, day) as valid time units. Currently the throttle limit is per machine: a limit of 100req/min means that each machine can execute 100req/min. For example: hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10req/sec' hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => READ, USER => 'u1', LIMIT => '10req/sec' hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10M/sec' hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => WRITE, USER => 'u1', LIMIT => '10M/sec' hbase> set_quota TYPE => THROTTLE, USER => 'u1', TABLE => 't2', LIMIT => '5K/min' hbase> set_quota TYPE => THROTTLE, USER => 'u1', NAMESPACE => 'ns2', LIMIT => NONE hbase> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '10req/sec' hbase> set_quota TYPE => THROTTLE, TABLE => 't1', LIMIT => '10M/sec' hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => WRITE, TABLE => 't1', LIMIT => '10M/sec' hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => NONE hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => WRITE, USER => 'u1', LIMIT => NONE hbase> set_quota USER => 'u1', GLOBAL_BYPASS => true |
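To make the LIMIT format used by set_quota easier to read, here is a minimal Ruby sketch that splits a limit string into its amount, unit, and time window. The helper name parse_throttle_limit and the returned hash layout are assumptions made for illustration; they are not part of HBase, and the shell does its own parsing internally.

```ruby
# Hypothetical helper (not an HBase API): splits a throttle LIMIT string
# such as '10req/sec' or '10M/sec' into its parts.
SIZE_UNITS = %w[B K M G T P].freeze      # valid size units listed in the help text
TIME_UNITS = %w[sec min hour day].freeze # valid time units listed in the help text

LIMIT_PATTERN = %r{\A(\d+)(req|[#{SIZE_UNITS.join}])/(#{TIME_UNITS.join('|')})\z}i

def parse_throttle_limit(limit)
  match = LIMIT_PATTERN.match(limit)
  raise ArgumentError, "unrecognized limit: #{limit.inspect}" unless match

  unit = match[2]
  is_request = unit.downcase == 'req'
  {
    amount: match[1].to_i,
    kind: is_request ? :requests : :size,   # request count vs. data size
    unit: is_request ? 'req' : unit.upcase, # normalize 'k' to 'K'
    per: match[3].downcase
  }
end

p parse_throttle_limit('10req/sec')
p parse_throttle_limit('5K/min')
```

A string such as '10req/week' raises ArgumentError in this sketch because week is not one of the listed time units.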
HBase General Shell Commands
These shell commands are commonly used to check the version and status of the database.
COMMAND | USAGE & EXAMPLES |
---|---|
alter_namespace | Alter namespace properties. To add/modify a property: hbase> alter_namespace 'ns1', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'} To delete a property: hbase> alter_namespace 'ns1', {METHOD => 'unset', NAME => 'PROPERTY_NAME'} |
create_namespace | Create namespace; pass namespace name, and optionally a dictionary of namespace configuration. Examples: hbase> create_namespace 'ns1' hbase> create_namespace 'ns1', {'PROPERTY_NAME' => 'PROPERTY_VALUE'} |
describe_namespace | Describe the named namespace. For example: hbase> describe_namespace 'ns1' |
drop_namespace | Drop the named namespace. The namespace must be empty. |
list_namespace | List all namespaces in hbase. An optional regular expression parameter can be used to filter the output. Examples: hbase> list_namespace hbase> list_namespace 'abc.*' |
list_namespace_tables | List all tables that are members of the namespace. Examples: hbase> list_namespace_tables 'ns1' |
Data Manipulation Language (DML) Shell Commands
DML HBase shell commands include the most commonly used commands for modifying data. For example, put inserts rows into a table, get and scan retrieve data, delete and truncate remove data, and append appends to cells. The group contains many more commands.
Data Definition Language (DDL) Shell Commands
DDL HBase shell commands are another set of commands, used mostly to change the structure of a table. For example, alter deletes a column family from a table or makes any other alteration to the table; before you run alter, make sure you disable the table first (unless the "hbase.online.schema.update.enable" property is set to true). create creates a table, drop drops a table, and there are many more.
PAIR RDD FUNCTIONS | FUNCTION DESCRIPTION |
---|---|
aggregateByKey | Aggregates the values of each key in a data set. This function can return a different result type than the values in the input RDD. |
combineByKey | Combines the elements for each key. |
combineByKeyWithClassTag | Combines the elements for each key. |
flatMapValues | Flattens the values of each key without changing the keys, and keeps the original RDD partitioning. |
foldByKey | Merges the values of each key. |
groupByKey | Returns the grouped RDD by grouping the values of each key. |
mapValues | Applies a map function to each value in a pair RDD without changing the keys. |
reduceByKey | Returns a merged RDD by merging the values of each key. |
reduceByKeyLocally | Returns a merged RDD by merging the values of each key and final result will be sent to the master. |
sampleByKey | Returns the subset of the RDD. |
subtractByKey | Return an RDD with the pairs from this whose keys are not in other. |
keys | Returns all keys of this RDD as a RDD[T]. |
values | Returns an RDD with just values. |
partitionBy | Returns a new RDD after applying specified partitioner. |
fullOuterJoin | Return RDD after applying fullOuterJoin on current and parameter RDD |
join | Return RDD after applying join on current and parameter RDD |
leftOuterJoin | Return RDD after applying leftOuterJoin on current and parameter RDD |
rightOuterJoin | Return RDD after applying rightOuterJoin on current and parameter RDD |
Namespace Commands Syntax and Usage
This group contains commands to alter & create the namespace of the HBase database.
NAME | PROCEDURE HBASE SHELL COMMANDS USAGE |
---|---|
abort_procedure | Given a procedure Id (and optional boolean may_interrupt_if_running parameter, default is true), abort a procedure in hbase. Use with caution. Some procedures might not be abortable. For experts only. If this command is accepted and the procedure is in the process of aborting, it will return true; if the procedure could not be aborted (e.g. the procedure does not exist, the procedure already completed, or the abort would cause corruption), this command will return false. Examples: hbase> abort_procedure proc_id hbase> abort_procedure proc_id, true hbase> abort_procedure proc_id, false |
list_procedures | List all procedures in hbase. For example: hbase> list_procedures |
Tool Syntax and Usage
COMMANDS | USAGE & EXAMPLES |
---|---|
grant | Grant users specific rights. Permissions is zero or more letters from the set "RWXCA": READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A'). Note: Groups and users are granted access in the same way, but groups are prefixed with an '@' character. Tables and namespaces are specified in the same way, but namespaces are prefixed with an '@' character. For example: hbase> grant 'bobsmith', 'RWXCA' hbase> grant '@admins', 'RWXCA' hbase> grant 'bobsmith', 'RWXCA', '@ns1' hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1' hbase> grant 'bobsmith', 'RW', 'ns1:t1', 'f1', 'col1' |
list_security_capabilities | List supported security capabilities. Example: hbase> list_security_capabilities |
revoke | Revoke a user's access rights. Note: Access for groups and users is revoked in the same way, but groups are prefixed with an '@' character. Tables and namespaces are specified in the same way, but namespaces are prefixed with an '@' character. For example: hbase> revoke 'bobsmith' hbase> revoke '@admins' hbase> revoke 'bobsmith', '@ns1' hbase> revoke 'bobsmith', 't1', 'f1', 'col1' hbase> revoke 'bobsmith', 'ns1:t1', 'f1', 'col1' |
user_permission | Show all permissions for the particular user. Syntax: user_permission <table>. Note: A namespace must always be preceded by the '@' character. For example: hbase> user_permission hbase> user_permission '@ns1' hbase> user_permission '@.*' hbase> user_permission '@^[a-c].*' hbase> user_permission 'table1' hbase> user_permission 'namespace1:table1' hbase> user_permission '.*' hbase> user_permission '^[A-C].*' |
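The permission letters and the '@' prefix convention described in the table can be illustrated with a short Ruby sketch. The helper names expand_permissions and grantee_kind are hypothetical and exist only for this example; HBase itself interprets these strings inside the grant and revoke commands.

```ruby
# Hypothetical helpers (not HBase APIs) that mirror the conventions in the table.
PERMISSIONS = {
  'R' => 'READ',
  'W' => 'WRITE',
  'X' => 'EXEC',
  'C' => 'CREATE',
  'A' => 'ADMIN'
}.freeze

# Expands a permission string such as 'RW' into the action names it stands for.
def expand_permissions(letters)
  letters.each_char.map do |ch|
    PERMISSIONS.fetch(ch) { raise ArgumentError, "unknown permission letter: #{ch}" }
  end
end

# In the first argument of grant/revoke, a leading '@' marks a group.
def grantee_kind(name)
  name.start_with?('@') ? :group : :user
end

p expand_permissions('RWXCA')
p grantee_kind('@admins')
```

The same '@' prefix marks a namespace when it appears in the table position, which is why 'bobsmith', 'RWXCA', '@ns1' grants rights on a namespace rather than on a table.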
Replication Syntax and Usage
Note: To use these tools, hbase.replication must be set to true. Commands in this group are mainly used to add or remove a peer from an HBase cluster.
COMMANDS | USAGE & EXAMPLES |
---|---|
clone_snapshot | Create a new table by cloning the snapshot content. No data is copied, and writes to the newly created table do not affect the snapshot data. Examples: hbase> clone_snapshot 'snapshotName', 'tableName' hbase> clone_snapshot 'snapshotName', 'namespace:tableName' |
delete_all_snapshot | Delete all of the snapshots matching the given regex. Examples: hbase> delete_all_snapshot 's.*' |
delete_snapshot | Delete a specified snapshot. Examples: hbase> delete_snapshot 'snapshotName' |
list_snapshots | List all snapshots taken (by printing the names and relative information). An optional regular expression parameter can be used to filter the output by snapshot name. Examples: hbase> list_snapshots hbase> list_snapshots 'abc.*' |
restore_snapshot | Restore a specified snapshot. The restore will replace the content of the original table, bringing back the content to the snapshot state. The table must be disabled. Examples: hbase> restore_snapshot 'snapshotName' |
snapshot | Take a snapshot of the specified table. Examples: hbase> snapshot 'sourceTable', 'snapshotName' hbase> snapshot 'namespace:sourceTable', 'snapshotName', {SKIP_FLUSH => true} |
Snapshot Shell commands
This group of commands is used to take the snapshot of the database at any given time.
COMMAND | USAGE & EXAMPLES |
---|---|
alter | If the "hbase.online.schema.update.enable" property is set to false, then the table must be disabled (see help 'disable'). If the "hbase.online.schema.update.enable" property is set to true, tables can be altered without disabling them first. Altering enabled tables has caused problems in the past, so use caution and test it before using in production. You can use the alter command to add, modify or delete column families or change table configuration options. Column families work in a similar way as the 'create' command. The column family specification can either be a name string, or a dictionary with the NAME attribute. Dictionaries are described in the output of the 'help' command, with no arguments. For example, to change or add the 'f1' column family in table 't1' from current value to keep a maximum of 5 cell VERSIONS, do: hbase> alter 't1', NAME => 'f1', VERSIONS => 5 You can operate on several column families: hbase> alter 't1', 'f1', {NAME => 'f2', IN_MEMORY => true}, {NAME => 'f3', VERSIONS => 5} To delete the 'f1' column family in table 'ns1:t1', use one of: hbase> alter 'ns1:t1', NAME => 'f1', METHOD => 'delete' hbase> alter 'ns1:t1', 'delete' => 'f1' You can also change table-scope attributes like MAX_FILESIZE, READONLY, MEMSTORE_FLUSHSIZE, DURABILITY, etc. These can be put at the end; for example, to change the max size of a region to 128MB, do: hbase> alter 't1', MAX_FILESIZE => '134217728' You can add a table coprocessor by setting a table coprocessor attribute: hbase> alter 't1', 'coprocessor' => 'hdfs:///foo.jar\|com.foo.FooRegionObserver\|1001\|arg1=1,arg2=2' Since you can have multiple coprocessors configured for a table, a sequence number will be automatically appended to the attribute name to uniquely identify it. The coprocessor attribute must match the pattern below in order for the framework to understand how to load the coprocessor classes: [coprocessor jar file location] \| class name \| [priority] \| [arguments] You can also set configuration settings specific to this table or column family: hbase> alter 't1', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'} hbase> alter 't1', {NAME => 'f2', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}} You can also remove a table-scope attribute: hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE' hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'coprocessor$1' You can also set REGION_REPLICATION: hbase> alter 't1', {REGION_REPLICATION => 2} There could be more than one alteration in one command: hbase> alter 't1', { NAME => 'f1', VERSIONS => 3 }, { MAX_FILESIZE => '134217728' }, { METHOD => 'delete', NAME => 'f2' }, OWNER => 'johndoe', METADATA => { 'mykey' => 'myvalue' } |
alter_async | Alter column family schema without waiting for all regions to receive the schema changes. Pass the table name and a dictionary specifying the new column family schema. Dictionaries are described in the main help command output. The dictionary must include the name of the column family to alter. For example, to change or add the 'f1' column family in table 't1' from defaults to instead keep a maximum of 5 cell VERSIONS, do: hbase> alter_async 't1', NAME => 'f1', VERSIONS => 5 To delete the 'f1' column family in table 'ns1:t1', do: hbase> alter_async 'ns1:t1', NAME => 'f1', METHOD => 'delete' or a shorter version: hbase> alter_async 'ns1:t1', 'delete' => 'f1' You can also change table-scope attributes like MAX_FILESIZE, MEMSTORE_FLUSHSIZE, READONLY, and DEFERRED_LOG_FLUSH. For example, to change the max size of a family to 128MB, do: hbase> alter 't1', METHOD => 'table_att', MAX_FILESIZE => '134217728' There could be more than one alteration in one command: hbase> alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'} To check if all the regions have been updated, use alter_status. |
alter_status | Get the status of the alter command. Indicates the number of regions of the table that have received the updated schema. Pass the table name. hbase> alter_status 't1' hbase> alter_status 'ns1:t1' |
create | Creates a table. Pass a table name, a set of column family specifications (at least one), and, optionally, table configuration. A column specification can be a simple string (name) or a dictionary (dictionaries are described in the main help output), necessarily including the NAME attribute. Examples: Create a table with namespace=ns1 and table qualifier=t1: hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5} Create a table with namespace=default and table qualifier=t1: hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'} hbase> # The above in shorthand would be the following: hbase> create 't1', 'f1', 'f2', 'f3' hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true} hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}} hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 1000000, MOB_COMPACT_PARTITION_POLICY => 'weekly'} Table configuration options can be put at the end. Examples: hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40'] hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40'] hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe' hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' } hbase> # Optionally pre-split the table into NUMREGIONS, using hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname) hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'} hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}} hbase> create 't1', {NAME => 'f1', DFS_REPLICATION => 1} You can also keep around a reference to the created table: hbase> t1 = create 't1', 'f1' This gives you a reference to the table named 't1', on which you can then call methods. |
describe | Describe the named table. For example: hbase> describe 't1' hbase> describe 'ns1:t1' Alternatively, you can use the abbreviated 'desc' for the same thing. hbase> desc 't1' hbase> desc 'ns1:t1' |
disable | Start disable of named table: hbase> disable 't1' hbase> disable 'ns1:t1' |
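disable_all | Disable all of the tables matching the given regex: hbase> disable_all 't.*' hbase> disable_all 'ns:t.*' hbase> disable_all 'ns:.*' |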
drop | Drop the named table. The table must first be disabled: hbase> drop 't1' hbase> drop 'ns1:t1' |
drop_all | Drop all of the tables matching the given regex: hbase> drop_all 't.*' hbase> drop_all 'ns:t.*' hbase> drop_all 'ns:.*' |
enable | Start enable of named table: hbase> enable 't1' hbase> enable 'ns1:t1' |
enable_all | Enable all of the tables matching the given regex: hbase> enable_all 't.*' hbase> enable_all 'ns:t.*' hbase> enable_all 'ns:.*' |
exists | Does the named table exist? hbase> exists 't1' hbase> exists 'ns1:t1' |
get_table | Get the given table name and return it as an actual object to be manipulated by the user. See table.help for more information on how to use the table. E.g. hbase> t1 = get_table 't1' hbase> t1 = get_table 'ns1:t1' returns the table named 't1' as a table object. You can then do hbase> t1.help which will then print the help for that table. |
is_disabled | Is the named table disabled? For example: hbase> is_disabled 't1' hbase> is_disabled 'ns1:t1' |
is_enabled | Is the named table enabled? For example: hbase> is_enabled 't1' hbase> is_enabled 'ns1:t1' |
list | List all tables. Example: hbase> list Sample output: TABLE table1 table2 |
locate_region | Locate the region given a table name and a row key. hbase> locate_region 'tableName', 'key0' |
show_filters | Example: hbase> show_filters Sample output: DependentColumnFilter KeyOnlyFilter ColumnCountGetFilter SingleColumnValueFilter PrefixFilter SingleColumnValueExcludeFilter FirstKeyOnlyFilter ColumnRangeFilter TimestampsFilter FamilyFilter QualifierFilter ColumnPrefixFilter RowFilter MultipleColumnPrefixFilter InclusiveStopFilter PageFilter ValueFilter ColumnPaginationFilter |
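The alter entry above describes a coprocessor attribute pattern of the form [jar location] | class name | [priority] | [arguments] and uses the byte value 134217728 for 128MB. The following Ruby sketch breaks such an attribute string into its fields and shows where the byte value comes from. The function parse_coprocessor_spec and its return shape are assumptions for illustration only; HBase parses this attribute itself when loading the coprocessor.

```ruby
# 128MB expressed in bytes, as used with MAX_FILESIZE in the alter examples.
MAX_FILESIZE_128MB = 128 * 1024 * 1024

# Hypothetical helper (not an HBase API): splits a coprocessor attribute value
# of the form "[jar]|class|[priority]|[arg1=v1,arg2=v2]" into named fields.
def parse_coprocessor_spec(spec)
  jar, class_name, priority, args = spec.split('|', 4).map(&:strip)
  raise ArgumentError, 'class name is required' if class_name.nil? || class_name.empty?

  {
    jar: jar.to_s.empty? ? nil : jar,                          # optional
    class_name: class_name,                                    # required
    priority: priority.to_s.empty? ? nil : Integer(priority),  # optional
    args: (args || '').split(',').map { |pair| pair.split('=', 2) }.to_h
  }
end

p MAX_FILESIZE_128MB
p parse_coprocessor_spec('hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2')
```

Only the class name is mandatory in the pattern, so the sketch returns nil for a missing jar location or priority and an empty hash when no arguments are given.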
Configuration Syntax and Usage
COMMANDS | USAGE & EXAMPLES |
---|---|
add_peer | A peer can either be another HBase cluster or a custom replication endpoint. In either case an id must be specified to identify the peer. For an HBase cluster peer, a cluster key must be provided and is composed like this: hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent This gives a full path for HBase to connect to another HBase cluster. An optional parameter for table column families identifies which column families will be replicated to the peer cluster. Examples: hbase> add_peer '1', "server1.cie.com:2181:/hbase" hbase> add_peer '2', "zk1,zk2,zk3:2182:/hbase-prod" hbase> add_peer '3', "zk4,zk5,zk6:11000:/hbase-test", "table1; table2:cf1; table3:cf1,cf2" hbase> add_peer '4', CLUSTER_KEY => "server1.cie.com:2181:/hbase" hbase> add_peer '5', CLUSTER_KEY => "zk1,zk2,zk3:2182:/hbase-prod", TABLE_CFS => { "table1" => [], "ns2:table2" => ["cf1"], "ns3:table3" => ["cf1", "cf2"] } For a custom replication endpoint, the ENDPOINT_CLASSNAME can be provided. Two optional arguments are DATA and CONFIG, which can be specified to set different values for either the peer_data or configuration for the custom replication endpoint. Table column families are optional and can be specified with the key TABLE_CFS. hbase> add_peer '6', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint' hbase> add_peer '7', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint', DATA => { "key1" => 1 } hbase> add_peer '8', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint', CONFIG => { "config1" => "value1", "config2" => "value2" } hbase> add_peer '9', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint', DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" } hbase> add_peer '10', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint', TABLE_CFS => { "table1" => [], "ns2:table2" => ["cf1"], "ns3:table3" => ["cf1", "cf2"] } hbase> add_peer '11', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint', DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" }, TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] } Note: Either CLUSTER_KEY or ENDPOINT_CLASSNAME must be specified but not both. |
append_peer_tableCFs | Append a replicable table-cf config for the specified peer. Examples: # append a table / table-cf to be replicable for a peer hbase> append_peer_tableCFs '2', { "ns1:table4" => ["cfA", "cfB"] } |
disable_peer | Stops the replication stream to the specified cluster, but still keeps track of new edits to replicate. Examples: hbase> disable_peer '1' |
disable_table_replication | Disable a table's replication switch. Examples: hbase> disable_table_replication 'table_name' |
enable_peer | Restarts the replication to the specified peer cluster, continuing from where it was disabled. Examples: hbase> enable_peer '1' |
enable_table_replication | Enable a table's replication switch. Examples: hbase> enable_table_replication 'table_name' |
get_peer_config | Outputs the cluster key, replication endpoint class (if present), and any replication configuration parameters. |
list_peer_configs | No-argument method that outputs the replication peer configuration for each peer defined on this cluster. |
list_peers | List all replication peer clusters. hbase> list_peers |
list_replicated_tables | List all the tables and column families replicated from this cluster. hbase> list_replicated_tables hbase> list_replicated_tables 'abc.*' |
remove_peer | Stops the specified replication stream and deletes all the meta information kept about it. Examples: hbase> remove_peer '1' |
remove_peer_tableCFs | Remove a table / table-cf from the table-cfs config for the specified peer. Examples: # Remove a table / table-cf from the replicable table-cfs for a peer hbase> remove_peer_tableCFs '2', { "ns1:table1" => [] } hbase> remove_peer_tableCFs '2', { "ns1:table1" => ["cf1"] } |
set_peer_tableCFs | Set the replicable table-cf config for the specified peer. Examples: # set all tables to be replicable for a peer hbase> set_peer_tableCFs '1', "" hbase> set_peer_tableCFs '1' # set table / table-cf to be replicable for a peer, for a table without # an explicit column-family list, all replicable column-families (with # replication_scope == 1) will be replicated hbase> set_peer_tableCFs '2', { "ns1:table1" => [], "ns2:table2" => ["cf1", "cf2"], "ns3:table3" => ["cfA", "cfB"] } |
show_peer_tableCFs | Show replicable table-cf config for the specified peer. hbase> show_peer_tableCFs |
update_peer_config | A peer can either be another HBase cluster or a custom replication endpoint. In either case an id must be specified to identify the peer. This command does not interrupt processing on an enabled replication peer. Two optional arguments are DATA and CONFIG, which can be specified to set different values for either the peer_data or configuration for a custom replication endpoint. Any existing values not updated by this command are left unchanged. CLUSTER_KEY, REPLICATION_ENDPOINT, and TABLE_CFs cannot be updated with this command. To update TABLE_CFs, see the append_peer_tableCFs and remove_peer_tableCFs commands. hbase> update_peer_config '1', DATA => { "key1" => 1 } hbase> update_peer_config '2', CONFIG => { "config1" => "value1", "config2" => "value2" } hbase> update_peer_config '3', DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" } |
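The add_peer entry describes the cluster key as hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent. As a rough illustration of that layout, the Ruby sketch below composes and decomposes such a key. Both function names are hypothetical and chosen for this example; in practice you pass the finished string straight to add_peer.

```ruby
# Hypothetical helpers (not HBase APIs) illustrating the cluster key layout:
#   <zookeeper quorum>:<zookeeper client port>:<znode parent>

# Builds a cluster key from a list of ZooKeeper hosts, a port, and a znode path.
def build_cluster_key(quorum_hosts, client_port, znode_parent)
  [Array(quorum_hosts).join(','), client_port, znode_parent].join(':')
end

# Splits a cluster key back into its three components.
def parse_cluster_key(key)
  quorum, port, znode = key.split(':', 3)
  { quorum: quorum.split(','), client_port: Integer(port), znode_parent: znode }
end

key = build_cluster_key(%w[zk1 zk2 zk3], 2182, '/hbase-prod')
puts key
p parse_cluster_key('server1.cie.com:2181:/hbase')
```

The quorum part is a comma-separated host list, so a single-host key such as the one in the first add_peer example parses to a one-element array.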
Quota Syntax and Usage
COMMANDS | SECURITY HBASE SHELL COMMANDS USAGE |
---|---|
grant | Grant users specific rights. Syntax: grant <user>, <permissions> [, <@namespace> [, <table> [, <column family> [, <column qualifier>]]]] Permissions is zero or more letters from the set "RWXCA": READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A'). Note: Groups and users are granted access in the same way, but groups are prefixed with an '@' character. Tables and namespaces are specified in the same way, but namespaces are prefixed with an '@' character. For example: hbase> grant 'bobsmith', 'RWXCA' hbase> grant '@admins', 'RWXCA' hbase> grant 'bobsmith', 'RWXCA', '@ns1' hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1' hbase> grant 'bobsmith', 'RW', 'ns1:t1', 'f1', 'col1' |
list_security_capabilities | List supported security capabilities. Example: hbase> list_security_capabilities |
revoke | Revoke a user's access rights. Syntax: revoke <user> [, <@namespace> [, <table> [, <column family> [, <column qualifier>]]]] Note: Access for groups and users is revoked in the same way, but groups are prefixed with an '@' character. Tables and namespaces are specified in the same way, but namespaces are prefixed with an '@' character. For example: hbase> revoke 'bobsmith' hbase> revoke '@admins' hbase> revoke 'bobsmith', '@ns1' hbase> revoke 'bobsmith', 't1', 'f1', 'col1' hbase> revoke 'bobsmith', 'ns1:t1', 'f1', 'col1' |
user_permission | Show all permissions for the particular user. Syntax: user_permission <table> Note: A namespace must always be preceded by the '@' character. For example: hbase> user_permission hbase> user_permission '@ns1' hbase> user_permission '@.*' hbase> user_permission '@^[a-c].*' hbase> user_permission 'table1' hbase> user_permission 'namespace1:table1' hbase> user_permission '.*' hbase> user_permission '^[A-C].*' |
HBase Security Syntax and Usage
Note: Security commands are only applicable when running with the AccessController coprocessor. These HBase shell commands are mostly used by an admin to secure the database and its tables.
PAIR RDD ACTION FUNCTIONS | FUNCTION DESCRIPTION |
---|---|
collectAsMap | Returns the pair RDD as a Map to the Spark Master. |
countByKey | Returns the count of elements for each key. The final result is returned as a local Map to your driver. |
countByKeyApprox | Same as countByKey but returns a partial result. It takes a timeout parameter that specifies how long the function may run before returning. |
lookup | Returns a list of values from RDD for a given input key. |
reduceByKeyLocally | Returns a merged RDD by merging the values of each key and final result will be sent to the master. |
saveAsHadoopDataset | Saves the RDD to any Hadoop-supported file system (HDFS, S3, ElasticSearch, etc.). It uses a Hadoop JobConf object to save. |
saveAsHadoopFile | Saves the RDD to any Hadoop-supported file system (HDFS, S3, ElasticSearch, etc.). It uses a Hadoop OutputFormat class to save. |
saveAsNewAPIHadoopDataset | Saves the RDD to any Hadoop-supported file system (HDFS, S3, ElasticSearch, etc.) with the new Hadoop API. It uses a Hadoop Configuration object to save. |
saveAsNewAPIHadoopFile | Saves the RDD to any Hadoop-supported file system (HDFS, S3, ElasticSearch, etc.). It uses new Hadoop API OutputFo
Procedure Syntax and Usage
COMMANDS | USAGE & EXAMPLES |
---|---|
add_rsgroup | Create a new RegionServer group. Example: hbase> add_rsgroup 'my_group' |
balance_rsgroup | Balance a RegionServer group. Example: hbase> balance_rsgroup 'my_group' |
get_rsgroup | Get a RegionServer group's information. Example: hbase> get_rsgroup 'default' |
get_server_rsgroup | Get the group name the given RegionServer is a member of. Example: hbase> get_server_rsgroup 'server1:port1' |
get_table_rsgroup | Get the RegionServer group name the given table is a member of. Example: hbase> get_table_rsgroup 'myTable' |
list_rsgroups | List all RegionServer groups. An optional regular expression parameter can be used to filter the output. Example: hbase> list_rsgroups hbase> list_rsgroups 'abc.*' |
move_servers_rsgroup | Reassign region servers from one RSGroup to another. hbase> move_servers_rsgroup 'dest', ['server1:port', 'server2:port'] |
move_tables_rsgroup | Reassign tables from one RSGroup to another. hbase> move_tables_rsgroup 'dest', ['table1', 'table2'] |
remove_rsgroup | Remove a RegionServer group. hbase> remove_rsgroup 'my_group' |
Visibility Label Syntax and Usage
ARRAY FUNCTION SYNTAX | ARRAY FUNCTION DESCRIPTION |
---|---|
array_contains(column: Column, value: Any) | Checks whether a value is present in an array column. Returns true if the value is present in the array, false if the value is not present, and null if the array is null. |
array_distinct(e: Column) | Return distinct values from the array after removing duplicates. |
array_except(col1: Column, col2: Column) | Returns all elements from col1 array but not in col2 array. |
array_intersect(col1: Column, col2: Column) | Returns all elements that are present in col1 and col2 arrays. |
array_join(column: Column, delimiter: String, nullReplacement: String) array_join(column: Column, delimiter: String) | Concatenates all elements of an array column using the provided delimiter. When null values are present, they are replaced with the 'nullReplacement' string. |
array_max(e: Column) | Returns the maximum value in an array. |
array_min(e: Column) | Returns the minimum value in an array. |
array_position(column: Column, value: Any) | Returns a position/index of first occurrence of the ‘value’ in the given array. Returns position as long type and the position is not zero based instead starts with 1. Returns zero when value is not found. Returns null when any of the arguments are null. |
array_remove(column: Column, element: Any) | Returns an array after removing all provided ‘value’ from the given array. |
array_repeat(e: Column, count: Int) | Creates an array containing the first argument repeated the number of times given by the second argument. |
array_repeat(left: Column, right: Column) | Creates an array containing the first argument repeated the number of times given by the second argument. |
array_sort(e: Column) | Returns the sorted array of the given input array. All null values are placed at the end of the array. |
array_union(col1: Column, col2: Column) | Returns an array of the elements present in either array (all elements from both arrays), without duplicates. |
arrays_overlap(a1: Column, a2: Column) | true – if `a1` and `a2` have at least one non-null element in common false – if `a1` and `a2` have completely different elements. null – if both the arrays are non-empty and any of them contains a `null` |
arrays_zip(e: Column*) | Returns a merged array of structs in which the N-th struct contains all N-th values of input |
concat(exprs: Column*) | Concatenates all elements from a given columns |
element_at(column: Column, value: Any) | Returns an element of an array located at the ‘value’ input position. |
exists(column: Column, f: Column => Column) | Returns whether a predicate holds for at least one element in the array. |
explode(e: Column) | Create a row for each element in the array column |
explode_outer(e: Column) | Create a row for each element in the array column. Unlike explode, if the array is null or empty, it returns null. |
filter(column: Column, f: Column => Column) filter(column: Column, f: (Column, Column) => Column) | Returns an array of elements for which a predicate holds in a given array |
flatten(e: Column) | Creates a single array from an array of arrays column. |
forall(column: Column, f: Column => Column) | Returns whether a predicate holds for every element in the array. |
posexplode(e: Column) | Creates a row for each element in the array and creates two columns: 'pos' to hold the position of the array element and 'col' to hold the actual array value. |
posexplode_outer(e: Column) | Creates a row for each element in the array and creates two columns: 'pos' to hold the position of the array element and 'col' to hold the actual array value. Unlike posexplode, if the array is null or empty, it returns null, null for the pos and col columns. |
reverse(e: Column) | Returns the array of elements in a reverse order. |
sequence(start: Column, stop: Column) | Generate the sequence of numbers from start to stop number. |
sequence(start: Column, stop: Column, step: Column) | Generate the sequence of numbers from start to stop number by incrementing with given step value. |
shuffle(e: Column) | Shuffle the given array |
size(e: Column) | Return the length of an array. |
slice(x: Column, start: Int, length: Int) | Returns an array of elements from position ‘start’ and the given length. |
sort_array(e: Column) | Sorts the array in an ascending order. Null values are placed at the beginning. |
sort_array(e: Column, asc: Boolean) | Sorts the array in ascending or descending order based on the boolean parameter. For ascending order, null values are placed at the beginning; for descending order, they are placed at the end. |
transform(column: Column, f: Column => Column) transform(column: Column, f: (Column, Column) => Column) | Returns an array of elements after applying a transformation to each element of the input array. |
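zip_with(left: Column, right: Column, f: (Column, Column) => Column) | Merges two input arrays, element-wise, into a single array using the given function. |
aggregate(expr: Column, zero: Column, merge: (Column, Column) => Column, finish: Column => Column) | Applies a binary operator to an initial state and all elements in the array, reducing them to a single state. The final state is converted into the final result by applying the finish function. |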
Rsgroup HBase Shell Commands
Note: The rsgroup Coprocessor Endpoint must be enabled on the Master; otherwise, the commands fail with:
UnknownProtocolException: No registered Master Coprocessor Endpoint found for RSGroupAdminService
COMMANDS | USAGE & EXAMPLES |
---|---|
assign | Assign a region. Use with caution. If region already assigned, this command will do a force reassign. For experts only. Examples: hbase> assign ‘REGIONNAME’ hbase> assign ‘ENCODED_REGIONNAME’ |
balance_switch | Enable/Disable balancer. Returns previous balancer state. Examples: hbase> balance_switch true hbase> balance_switch false |
balancer | Trigger the cluster balancer. Returns true if the balancer ran and was able to tell the region servers to unassign all the regions to balance (the re-assignment itself is async). Otherwise returns false (it will not run if regions are in transition). Examples: hbase> balancer |
balancer_enabled | Query the balancer’s state. Examples: hbase> balancer_enabled |
catalogjanitor_enabled | Query for the CatalogJanitor state (enabled/disabled?) Examples: hbase> catalogjanitor_enabled |
catalogjanitor_run | Catalog janitor command to run the (garbage collection) scan from command line. hbase> catalogjanitor_run |
catalogjanitor_switch | Enable/Disable CatalogJanitor. Returns previous CatalogJanitor state. Examples: hbase> catalogjanitor_switch true hbase> catalogjanitor_switch false |
close_region | Close a single region. Ask the master to close a region out on the cluster or, if ‘SERVER_NAME’ is supplied, ask the designated hosting regionserver to close the region directly. When closing a region through the master, ‘REGIONNAME’ must be a fully qualified region name. When asking the hosting regionserver to close a region directly, pass the region’s encoded name only. A region name looks like this: TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. or Namespace:TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. The trailing period is part of the region name. A region’s encoded name is the hash at the end of a region name, e.g. 527db22f95c8a9e0116f0cc13c680396 (without the period). A ‘SERVER_NAME’ is its host, port plus startcode. For example: host187.example.com,60020,1289493121758 (find the server name in the master UI or in the detailed status output in the shell). This command ends up running close on the regionserver hosting the region. The close is done without the master’s involvement (it will not know of the close). Once closed, the region stays closed. Use assign to reopen/reassign. Use unassign or move to assign the region elsewhere on the cluster. Use with caution. For experts only. Examples: hbase> close_region ‘REGIONNAME’ hbase> close_region ‘REGIONNAME’, ‘SERVER_NAME’ hbase> close_region ‘ENCODED_REGIONNAME’ hbase> close_region ‘ENCODED_REGIONNAME’, ‘SERVER_NAME’ |
compact | Compact all regions in passed table or pass a region row to compact an individual region. You can also compact a single column family within a region. Examples: Compact all regions in a table: hbase> compact ‘ns1:t1’ hbase> compact ‘t1’ Compact an entire region: hbase> compact ‘r1’ Compact only a column family within a region: hbase> compact ‘r1’, ‘c1’ Compact a column family within a table: hbase> compact ‘t1’, ‘c1’ |
compact_mob | Run compaction on a mob enabled column family or all mob enabled column families within a table Examples: Compact a column family within a table: hbase> compact_mob ‘t1’, ‘c1’ Compact all mob enabled column families hbase> compact_mob ‘t1’ |
compact_rs | Compact all regions on passed regionserver. Examples: Compact all regions on a regionserver: hbase> compact_rs ‘host187.example.com,60020’ or hbase> compact_rs ‘host187.example.com,60020,1289493121758’ Major compact all regions on a regionserver: hbase> compact_rs ‘host187.example.com,60020,1289493121758’, true |
flush | Flush all regions in passed table or pass a region row to flush an individual region. For example: hbase> flush ‘TABLENAME’ hbase> flush ‘REGIONNAME’ hbase> flush ‘ENCODED_REGIONNAME’ |
major_compact | Run major compaction on passed table or pass a region row to major compact an individual region. To compact a single column family within a region specify the region name followed by the column family name. Examples: Compact all regions in a table: hbase> major_compact ‘t1’ hbase> major_compact ‘ns1:t1’ Compact an entire region: hbase> major_compact ‘r1’ Compact a single column family within a region: hbase> major_compact ‘r1’, ‘c1’ Compact a single column family within a table: hbase> major_compact ‘t1’, ‘c1’ |
major_compact_mob | Run major compaction on a mob enabled column family or all mob enabled column families within a table Examples: Compact a column family within a table: hbase> major_compact_mob ‘t1’, ‘c1’ Compact all mob enabled column families within a table hbase> major_compact_mob ‘t1’ |
merge_region | Merge two regions. Passing ‘true’ as the optional third parameter forces the merge; without it, the merge fails unless the two regions are adjacent. ‘force’ is for expert use only. NOTE: You must pass the encoded region name, not the full region name, so this command is a little different from other region operations. The encoded region name is the hash suffix on region names: e.g. if the region name were TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396 Examples: hbase> merge_region ‘ENCODED_REGIONNAME’, ‘ENCODED_REGIONNAME’ hbase> merge_region ‘ENCODED_REGIONNAME’, ‘ENCODED_REGIONNAME’, true |
move | Move a region. Optionally specify target regionserver else we choose one at random. NOTE: You pass the encoded region name, not the region name so this command is a little different to the others. The encoded region name is the hash suffix on region names: e.g. if the region name were TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396 A server name is its host, port plus startcode. For example: host187.example.com,60020,1289493121758 Examples: hbase> move ‘ENCODED_REGIONNAME’ hbase> move ‘ENCODED_REGIONNAME’, ‘SERVER_NAME’ |
normalize | Trigger region normalizer for all tables which have NORMALIZATION_ENABLED flag set. Returns true if normalizer ran successfully, false otherwise. Note that this command has no effect if region normalizer is disabled (make sure it’s turned on using ‘normalizer_switch’ command). Examples: hbase> normalize |
normalizer_enabled | Query the state of region normalizer. Examples: hbase> normalizer_enabled |
normalizer_switch | Enable/Disable region normalizer. Returns previous normalizer state. When normalizer is enabled, it handles all tables with ‘NORMALIZATION_ENABLED’ => true. Examples: hbase> normalizer_switch true hbase> normalizer_switch false |
split | Split entire table or pass a region to split individual region. With the second parameter, you can specify an explicit split key for the region. Examples: split ‘tableName’ split ‘namespace:tableName’ split ‘regionName’ # format: ‘tableName,startKey,id’ split ‘tableName’, ‘splitKey’ split ‘regionName’, ‘splitKey’ |
trace | Start or stop tracing using HTrace. Returns true if tracing is running, otherwise false. If the first argument is ‘start’, a new span is started. If the first argument is ‘stop’, the currently running span is stopped (‘stop’ returns false on success). If the first argument is ‘status’, it just returns whether or not tracing is running. When starting, you can optionally pass the name of the span as the second argument. The default span name is ‘HBaseShell’. Repeating ‘start’ does not start a nested span. Examples: hbase> trace ‘start’ hbase> trace ‘status’ hbase> trace ‘stop’ hbase> trace ‘start’, ‘MySpanName’ hbase> trace ‘stop’ |
unassign | Unassign a region. Unassign will close the region in its current location and then reopen it again. Pass ‘true’ to force the unassignment (‘force’ will clear all in-memory state in the master before the reassign; if this results in a double assignment, use hbck -fix to resolve it). Use with caution. For expert use only. Examples: hbase> unassign ‘REGIONNAME’ hbase> unassign ‘REGIONNAME’, true hbase> unassign ‘ENCODED_REGIONNAME’ hbase> unassign ‘ENCODED_REGIONNAME’, true |
wal_roll | Roll the log writer. That is, start writing log messages to a new file. The name of the regionserver should be given as the parameter. A ‘server_name’ is the host, port plus startcode of a regionserver. For example: host187.example.com,60020,1289493121758 (find servername in master ui or when you do detailed status in shell) |
zk_dump | Dump status of HBase cluster as seen by ZooKeeper. |
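
Several entries above (close_region, merge_region, move) distinguish a full region name from its encoded name, and describe a server name as host, port plus startcode. The following is a minimal Ruby sketch of that relationship, not part of the HBase shell itself: the helper names `encoded_region_name` and `parse_server_name` are hypothetical, and the sketch assumes the encoded name is a lowercase hex hash between the last two periods, as in the sample names used in this article.

```ruby
# Illustrative helpers only; these are NOT HBase shell commands.

# Extract the encoded region name (the hash suffix) from a full region name.
# Assumes the full name ends with ".<hex hash>." (trailing period included).
def encoded_region_name(region_name)
  match = region_name.match(/\.([0-9a-f]+)\.\z/)
  raise ArgumentError, "not a full region name: #{region_name}" unless match
  match[1]
end

# Split a server name into host, port and (optional) startcode.
def parse_server_name(server_name)
  host, port, startcode = server_name.split(',')
  raise ArgumentError, "expected host,port[,startcode]: #{server_name}" if port.nil?
  { host: host, port: Integer(port), startcode: startcode && Integer(startcode) }
end

region = 'TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.'
puts encoded_region_name(region)   # prints 527db22f95c8a9e0116f0cc13c680396
puts parse_server_name('host187.example.com,60020,1289493121758')[:port]   # prints 60020
```

The value printed by the first call is what you would pass as ‘ENCODED_REGIONNAME’ to commands such as move or merge_region. The startcode is treated as optional here because compact_rs also accepts the shorter host,port form.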
Conclusion:
We have seen that HBase shell commands are broken down into several groups, each serving a different purpose, and we have looked at the usage, description, and examples of each command used to interact with HBase. I hope it helps!
Happy Learning!