Hive – Start HiveServer2 & Connect Beeline

In this article, I will explain what HiveServer2 is, how to start it, and how to connect to Hive using the Beeline command-line interface.

Prerequisites: Hive must be installed and set up to run on a Hadoop cluster.

HiveServer2 (a.k.a. HS2) is the second-generation Hive server. It enables:

  • Remote clients to execute queries against the Hive server
  • Multi-client concurrency and authentication
  • Better support for API clients such as JDBC and ODBC

HiveServer2 replaces the original HiveServer (HiveServer1), which was deprecated and then removed from Hive starting with release 1.0.0.

Before starting HiveServer2, make sure you have created the Hive Metastore and the data warehouse location, and that you are able to run the Hive CLI.

Start HiveServer2

The Hive distribution ships with a hiveserver2 script in the $HIVE_HOME/bin/ directory. Run it without any arguments to start HiveServer2.


~/hive$ $HIVE_HOME/bin/hiveserver2

To keep it running in the background like a service, run the same command as nohup $HIVE_HOME/bin/hiveserver2 &. When started from a terminal, nohup writes the output to a nohup.out file in the current directory, which serves as the log.
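
The nohup approach can be wrapped in a small helper that also picks the log location and records the process ID. This is only a sketch: start_in_background is a hypothetical helper name, and the /tmp paths in the usage comment are placeholders, not Hive defaults.

```shell
# Start a long-running command detached from the terminal.
# Usage: start_in_background <log_file> <pid_file> <command> [args...]
start_in_background() {
  log_file="$1"
  pid_file="$2"
  shift 2
  # nohup keeps the process alive after logout; stdout and stderr go to
  # log_file instead of the default nohup.out.
  nohup "$@" >"$log_file" 2>&1 &
  # $! holds the PID of the most recent background job.
  echo "$!" >"$pid_file"
}

# Example (assumes HIVE_HOME is set; both paths are placeholders):
#   start_in_background /tmp/hiveserver2.log /tmp/hiveserver2.pid \
#     "$HIVE_HOME/bin/hiveserver2"
```

The recorded PID belongs to the launcher that was started. Whether that is the JVM itself depends on how the wrapper scripts hand over to Java, so it is worth confirming with jps before relying on it.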

You can also start HiveServer2 (HS2) using the hive --service command.


~/hive$ $HIVE_HOME/bin/hive --service hiveserver2
(or)
# Start in nohup mode
~/hive$ nohup $HIVE_HOME/bin/hive --service hiveserver2 &

By default, HiveServer2 listens on port 10000. To change the port, set the hive.server2.thrift.port property in the $HIVE_HOME/conf/hive-site.xml file.
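
For a one-off run, the same property can also be overridden on the command line, since the hiveserver2 launcher accepts --hiveconf property=value arguments. The sketch below builds that argument and rejects obviously invalid ports before the JVM starts. hs2_port_arg is a hypothetical helper name, and 10001 is just an example value.

```shell
# Print the --hiveconf argument that overrides the Thrift port for one run.
# Rejects empty, non-numeric, and out-of-range values (valid: 1-65535).
hs2_port_arg() {
  port="${1:-}"
  case "$port" in
    ''|*[!0-9]*|??????*)
      echo "invalid port: $port" >&2
      return 1
      ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 1
  fi
  printf '%s\n' "--hiveconf hive.server2.thrift.port=$port"
}

# Example (assumes HIVE_HOME is set; left unquoted so it splits into two words):
#   $HIVE_HOME/bin/hiveserver2 $(hs2_port_arg 10001)
```

An override like this lasts only for that process. Setting the property in hive-site.xml remains the persistent option.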

Check whether the HiveServer2 service is running and listening on port 10000 using the netstat command.


~/hive$ netstat -anp | grep 10000
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp6       0      0 :::10000                :::*                    LISTEN      17820/java
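
HiveServer2 can take a while to open its port after the command returns, so a script that starts it and connects right away may fail. As a rough sketch, the check above can be automated by polling the port. This relies on bash's /dev/tcp pseudo-device (run it with bash), and port_open and wait_for_port are hypothetical helper names.

```shell
# Return 0 if something accepts TCP connections on host:port, non-zero otherwise.
port_open() {
  # The subshell keeps a failed redirection from affecting the caller.
  ( : >"/dev/tcp/$1/$2" ) 2>/dev/null
}

# Poll once per second until the port opens or the limit (seconds) is reached.
wait_for_port() {
  host="$1"
  port="$2"
  limit="${3:-60}"
  elapsed=0
  until port_open "$host" "$port"; do
    [ "$elapsed" -ge "$limit" ] && return 1
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Example: wait up to 120 seconds for HiveServer2 on the default port.
#   wait_for_port 127.0.0.1 10000 120 && echo "HiveServer2 is listening"
```

An open port only shows that a process is listening. It does not confirm that the process is HiveServer2 or that it is ready to serve queries.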

Since HiveServer2 runs as a Java process, you can also use the jps command to check that it is running. It appears as RunJar, with the same process ID (17820) that netstat reported.


~/hive$ jps
18025 Jps
17820 RunJar

HiveServer2 Web UI

HiveServer2 also starts a Jetty HTTP server that provides a Web UI on port 10002 by default (127.0.0.1:10002). This Web UI shows configuration, logging, metrics, and active session information.


Connect Beeline

Beeline is the command shell that works with HiveServer2. It is a JDBC client based on the SQLLine CLI and is located in the $HIVE_HOME/bin directory.


~/hive$ bin/beeline
Beeline version 2.3.7 by Apache Hive
beeline>

Now enter the beeline command !connect as shown below. All beeline commands start with !.

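Hive does not ship with predefined users. With the default setting (hive.server2.authentication set to NONE), HiveServer2 does not validate the password, so any user name and password work. This article uses scott and tiger.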


beeline> !connect jdbc:hive2://127.0.0.1:10000 scott tiger

If you get the connection error "User: is not allowed to impersonate", set the hive.server2.enable.doAs property to false in the $HIVE_HOME/conf/hive-site.xml file, restart HiveServer2, and run the beeline command again.
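
Properties in hive-site.xml are written as property elements inside the file's configuration root element. As a small sketch of that shape, the helper below prints the element for a given name and value. hive_site_property is a hypothetical helper name. It does not edit the file and does not escape XML special characters.

```shell
# Print one hive-site.xml <property> element for the given name and value.
# Paste the output inside <configuration> ... </configuration>.
hive_site_property() {
  cat <<EOF
<property>
  <name>$1</name>
  <value>$2</value>
</property>
EOF
}

# Example: the element that disables impersonation.
#   hive_site_property hive.server2.enable.doAs false
```

The change takes effect only after HiveServer2 is restarted.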

Now, you should see a message like the one below, followed by the JDBC Beeline prompt.


beeline> !connect jdbc:hive2://127.0.0.1:10000 scott tiger
Connecting to jdbc:hive2://127.0.0.1:10000
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://127.0.0.1:10000>

If you still have connectivity issues, try the actual IP address or hostname instead of 127.0.0.1.

After connecting successfully, enter show databases; to list the databases. By default, Hive provides a database named default.


0: jdbc:hive2://127.0.0.1:10000> show databases;
+----------------+
| database_name  |
+----------------+
| default        |
+----------------+
1 rows selected (1.775 seconds)
0: jdbc:hive2://127.0.0.1:10000>

You can also connect with Beeline directly from the Unix shell. For more command options, please refer to the Beeline options.


~/hive$ bin/beeline -u jdbc:hive2://127.0.0.1:10000 -n scott -p tiger

When Beeline is used in a production environment, users don't need direct access to the Hive Metastore or to the HDFS location of the data warehouse directory.

The Beeline shell works in both embedded mode and remote mode. In embedded mode, it runs an embedded Hive (similar to the Hive CLI), whereas remote mode connects to a separate HiveServer2 process over Thrift.
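
The two modes differ in the JDBC URL passed to Beeline: a URL with no host (jdbc:hive2://) selects embedded mode, while a URL with host and port connects to a remote HiveServer2. A minimal sketch that builds either form follows. hive_jdbc_url is a hypothetical helper name, and the defaults of port 10000 and database default are assumptions taken from this article's setup.

```shell
# Build a HiveServer2 JDBC URL.
# With no host, returns the embedded-mode URL; otherwise a remote-mode URL.
# Usage: hive_jdbc_url [host] [port] [database]
hive_jdbc_url() {
  host="${1:-}"
  port="${2:-10000}"
  db="${3:-default}"
  if [ -z "$host" ]; then
    printf '%s\n' "jdbc:hive2://"
  else
    printf '%s\n' "jdbc:hive2://${host}:${port}/${db}"
  fi
}

# Remote mode:
#   bin/beeline -u "$(hive_jdbc_url 127.0.0.1 10000)" -n scott -p tiger
# Embedded mode:
#   bin/beeline -u "$(hive_jdbc_url)"
```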

Hope you like this article.

Happy Learning !

NNK
