In this article, I will explain what HiveServer2 is, how to start it, and how to connect to Hive using the Beeline command-line interface.
Prerequisites: Hive is installed and set up to run on a Hadoop cluster.
HiveServer2 (HS2) is the second-generation Hive server. It enables:
- Remote clients to execute queries against the Hive server.
- Multi-client concurrency and authentication.
- Better support for API clients such as JDBC and ODBC.
HiveServer2 replaces the original HiveServer (HiveServer1), which was deprecated and has been removed from Hive releases starting with Hive 1.0.0.
Before starting HiveServer2, make sure you have created the Hive Metastore and the data warehouse location and are able to run Hive.
The Hive distribution comes with the hiveserver2 script, located in the $HIVE_HOME/bin/ directory. Run it without any arguments to start HiveServer2.
~/hive$ $HIVE_HOME/bin/hiveserver2
To keep it running in the background as a service, run the same command as nohup $HIVE_HOME/bin/hiveserver2 &. This creates a nohup.out file that contains the log.
You can also start HiveServer2 (HS2) using the hive --service command.
~/hive$ $HIVE_HOME/bin/hive --service hiveserver2
(or)
# Start in nohup mode
~/hive$ nohup $HIVE_HOME/bin/hive --service hiveserver2 &
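The nohup variant writes its log to nohup.out and leaves you to look up the process ID later. A minimal sketch of a wrapper that keeps the log and the PID in known files instead; the function name start_hs2 and the default /tmp paths are hypothetical and not part of Hive.

```shell
# Hypothetical helper: start HiveServer2 in the background and record
# where its log and PID live. Paths and names are assumptions.
start_hs2() {
  local hs2_bin="${1:-${HIVE_HOME}/bin/hiveserver2}"  # launcher script to run
  local log_file="${2:-/tmp/hiveserver2.log}"         # assumed log location
  local pid_file="${3:-/tmp/hiveserver2.pid}"         # assumed PID location

  if [ ! -x "$hs2_bin" ]; then
    echo "launcher not found or not executable: $hs2_bin" >&2
    return 1
  fi

  # nohup keeps the process alive after the terminal closes; stdout and
  # stderr go to the log file instead of nohup.out.
  nohup "$hs2_bin" > "$log_file" 2>&1 &
  echo $! > "$pid_file"
  echo "started HiveServer2 with PID $(cat "$pid_file"), logging to $log_file"
}

# Usage (commented out so sourcing this snippet starts nothing):
# start_hs2
```

With the PID on disk, the server can later be stopped with kill "$(cat /tmp/hiveserver2.pid)".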
By default, HiveServer2 runs on port 10000. If you want to change the port, set the hive.server2.thrift.port property.
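The port can also be overridden for a single launch with the --hiveconf option of the hiveserver2 launcher, without editing any file. The sketch below wraps that in a hypothetical helper, start_hs2_on_port, which adds only a basic sanity check on the port number; the helper name and the check are assumptions for illustration.

```shell
# Hypothetical helper: start HiveServer2 on a non-default Thrift port.
start_hs2_on_port() {
  local port="$1"
  local hs2_bin="${2:-${HIVE_HOME}/bin/hiveserver2}"

  # Accept only one to five digits, then check the TCP port range.
  case "$port" in
    '' | *[!0-9]* | ??????*)
      echo "invalid port: $port" >&2
      return 2
      ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 2
  fi

  # --hiveconf overrides a configuration property for this process only.
  "$hs2_bin" --hiveconf "hive.server2.thrift.port=${port}"
}

# Usage (commented out so sourcing this snippet starts nothing):
# start_hs2_on_port 10001
```

Clients then have to use the same port in their JDBC URL, for example jdbc:hive2://127.0.0.1:10001.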
Check whether the HiveServer2 service is running and listening on port 10000 using netstat.
~/hive$ netstat -anp | grep 10000
(Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.)
tcp6 0 0 :::10000 :::* LISTEN 17820/java
Since Hive is written in Java, you can also use the jps command to check that HiveServer2 is running. It shows up as the RunJar process, here with the same PID (17820) that netstat reported.
~/hive$ jps
18025 Jps
17820 RunJar
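A plain grep 10000 also matches any line that merely contains those digits, such as a PID or a longer port number. As a sketch, the check can be narrowed to sockets in the LISTEN state whose local address ends in the given port. The helper name port_is_listening is hypothetical, and the field positions assume the Linux netstat -an column layout shown above.

```shell
# Hypothetical helper: read `netstat -an` (or `netstat -anp`) output on
# stdin and succeed only if some socket is listening on the given port.
port_is_listening() {
  awk -v port="$1" '
    # Column 4 is the local address, column 6 the socket state.
    $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit(found ? 0 : 1) }
  '
}

# Usage (commented out so sourcing this snippet runs nothing):
# netstat -an | port_is_listening 10000 && echo "HiveServer2 port is open"
```

Because the function only reads text from stdin, it can be tried against saved netstat output as well as a live system.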
HiveServer2 Web UI
HiveServer2 also starts a Jetty HTTP server that serves a Web UI, by default on port 10002 (127.0.0.1:10002). This Web UI provides configuration, logging, metrics, and active session information.
Beeline is the command shell that works with HiveServer2. It is a JDBC client based on the SQLLine CLI.
beeline is located in the $HIVE_HOME/bin directory.
~/hive$ bin/beeline
Beeline version 2.3.7 by Apache Hive
beeline>
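Now enter the Beeline !connect command as shown below. All Beeline commands start with !.
With the default authentication mode (hive.server2.authentication set to NONE), HiveServer2 does not validate credentials, so this example simply uses the user scott and the password tiger.
beeline> !connect jdbc:hive2://127.0.0.1:10000 scott tiger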
If you get a connection error such as "User: is not allowed to impersonate", set the relevant impersonation property in the $HIVE_HOME/conf/hive-site.xml file, restart HiveServer2, and run the Beeline command again.
Now you should see a message like the one below, followed by the Beeline JDBC prompt.
beeline> !connect jdbc:hive2://127.0.0.1:10000 scott tiger
Connecting to jdbc:hive2://127.0.0.1:10000
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://127.0.0.1:10000>
If you still have connectivity issues, try the actual IP address or hostname instead of 127.0.0.1.
After successfully starting Beeline, enter show databases; to list the databases. By default, Hive provides the default database.
0: jdbc:hive2://127.0.0.1:10000> show databases;
+----------------+
| database_name  |
+----------------+
| default        |
+----------------+
1 rows selected (1.775 seconds)
0: jdbc:hive2://127.0.0.1:10000>
You can also issue a beeline command directly from the Unix shell, passing the user name with -n and the password with -p. For more command options, please refer to the Beeline options.
~/hive$ bin/beeline -u jdbc:hive2://127.0.0.1:10000 -n scott -p tiger
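For scripting, Beeline can run a single statement and exit by means of its -e option. A minimal sketch, assuming a hypothetical wrapper named run_hive_query and hypothetical environment variables (BEELINE_BIN, HS2_URL, HS2_USER, HS2_PASSWORD) for the connection details; the -u, -n, -p, -e, --silent and --outputformat options themselves are standard Beeline options.

```shell
# Hypothetical helper: run one HiveQL statement through Beeline and
# print the result as CSV. Variable names and defaults are assumptions.
run_hive_query() {
  local query="$1"
  local beeline_bin="${BEELINE_BIN:-${HIVE_HOME}/bin/beeline}"
  local url="${HS2_URL:-jdbc:hive2://127.0.0.1:10000}"

  # -n / -p pass the user name and password, -e the statement to run;
  # --silent and --outputformat=csv2 keep the output easy to parse.
  "$beeline_bin" -u "$url" \
    -n "${HS2_USER:-scott}" -p "${HS2_PASSWORD:-tiger}" \
    --silent=true --outputformat=csv2 \
    -e "$query"
}

# Usage (commented out so sourcing this snippet runs nothing):
# run_hive_query "show databases;"
```

Keeping the connection details in variables means the same function can point at another HiveServer2 instance without changing the script.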
When Beeline is used in a production environment, users do not need direct access to the Hive Metastore or to the HDFS location of the data warehouse directory.
The Beeline shell works in both embedded mode and remote mode. In embedded mode, it runs an embedded Hive (similar to the Hive CLI), whereas in remote mode it connects to a separate HiveServer2 process over Thrift.
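The two modes differ only in the JDBC URL handed to Beeline: a URL with no host (jdbc:hive2://) selects embedded mode, while a URL with a host and port selects remote mode. The sketch below builds either form; the helper name hs2_jdbc_url and its defaults (port 10000, database default) are assumptions for illustration.

```shell
# Hypothetical helper: build a HiveServer2 JDBC URL for Beeline.
hs2_jdbc_url() {
  local host="$1"
  local port="${2:-10000}"
  local db="${3:-default}"

  if [ -z "$host" ]; then
    # No host given: embedded mode, Hive runs inside the Beeline process.
    printf 'jdbc:hive2://\n'
  else
    # Host given: remote mode, connect to HiveServer2 over Thrift.
    printf 'jdbc:hive2://%s:%s/%s\n' "$host" "$port" "$db"
  fi
}

# Usage (commented out so sourcing this snippet runs nothing):
# bin/beeline -u "$(hs2_jdbc_url)"                   # embedded mode
# bin/beeline -u "$(hs2_jdbc_url 127.0.0.1 10000)"   # remote mode
```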
Hope you like this article.
Happy Learning !