Connect to Hive using JDBC connection

Hive provides a JDBC connection URL string, jdbc:hive2://ip-address:port, for connecting to the Hive warehouse from remote applications written in Java, Scala, Python, Spark, and many more.

In this article, I will explain how to connect to Hive from Java and Scala using the JDBC connection URL string and the Maven dependency hive-jdbc.

Hive JDBC Connection URL

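Similar to other databases, Hive supports a JDBC connection URL string, jdbc:hive2://ip-address:port, to connect to Hive from applications running remotely. The database name follows the port, as in the example below, which connects to the default database.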


jdbc:hive2://192.168.1.148:10000/default

If you are using an older version of Hive, your connection string should start with jdbc:hive:// instead.


jdbc:hive://
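The URL format above can be assembled from its parts. The following is a minimal sketch, not part of the hive-jdbc library: the class HiveJdbcUrl and its build() method are hypothetical names introduced here for illustration, and the host, port, and database values are the ones used in this article.

```java
// Hypothetical helper (not part of hive-jdbc) that assembles a HiveServer2 JDBC URL.
final class HiveJdbcUrl {
    private HiveJdbcUrl() {}

    // Builds jdbc:hive2://host:port/database from its parts.
    static String build(String host, int port, String database) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        if (database == null) {
            throw new IllegalArgumentException("database must not be null");
        }
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Same values as the example URL in this article.
        System.out.println(build("192.168.1.148", 10000, "default"));
    }
}
```

Keeping the URL in one place like this makes it easier to switch hosts or databases between environments.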

Hive JDBC Maven Dependency

To connect to Hive from a Java or Scala program and run HiveQL, you need the hive-jdbc library (https://mvnrepository.com/artifact/org.apache.hive/hive-jdbc) as a Maven or Gradle dependency. For Maven, add the artifact below to your pom.xml.


<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>3.1.2</version>
</dependency>

Use the artifact version that matches the Hive version you are using.

Use the new driver class org.apache.hive.jdbc.HiveDriver, which works with HiveServer2.

Note: If you are using an older version of Hive, you should use the driver org.apache.hadoop.hive.jdbc.HiveDriver and your connection string should be jdbc:hive://
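The pairing of URL prefix and driver class described above could be captured in a small lookup. This is only a sketch: HiveDriverChooser and driverFor() are hypothetical names, not part of hive-jdbc, and the two driver class names are the ones given in this article.

```java
// Hypothetical helper that maps a Hive JDBC URL prefix to the driver class
// named in this article. It only returns the class name; it does not load it.
final class HiveDriverChooser {
    static final String HIVESERVER2_DRIVER = "org.apache.hive.jdbc.HiveDriver";
    static final String LEGACY_DRIVER = "org.apache.hadoop.hive.jdbc.HiveDriver";

    private HiveDriverChooser() {}

    static String driverFor(String jdbcUrl) {
        if (jdbcUrl.startsWith("jdbc:hive2://")) {
            return HIVESERVER2_DRIVER;   // HiveServer2
        }
        if (jdbcUrl.startsWith("jdbc:hive://")) {
            return LEGACY_DRIVER;        // older Hive versions
        }
        throw new IllegalArgumentException("Not a Hive JDBC URL: " + jdbcUrl);
    }

    public static void main(String[] args) {
        System.out.println(driverFor("jdbc:hive2://192.168.1.148:10000/default"));
    }
}
```

The returned name can then be passed to Class.forName() before opening the connection.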

Start HiveServer2

To connect remotely from Java, Scala, Python, or any other programming language, you need to start the HiveServer2 service from the $HIVE_HOME/bin directory.


prabha@namenode:~/hive/bin$ ./hiveserver2
2020-10-03 23:17:08: Starting HiveServer2
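Before running a JDBC client, it can help to confirm that something is listening on the HiveServer2 host and port. The sketch below is a plain TCP check using only the JDK. The class name HiveServerPortCheck is hypothetical, and the host and port are assumed from the connection URL used in this article. An open port only shows that a process accepts connections there, not that it is HiveServer2.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
final class HiveServerPortCheck {
    private HiveServerPortCheck() {}

    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException ex) {
            // Refused, timed out, or unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions taken from the article's example URL.
        String host = args.length > 0 ? args[0] : "192.168.1.148";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 10000;
        System.out.println(host + ":" + port + " listening: " + isListening(host, port, 2000));
    }
}
```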

Accessing Hive from Java

Below is a complete example of accessing Hive from Java using the JDBC URL string and JDBC driver.

This example connects to the default database that comes with Hive and runs the show databases statement against it.


package com.sparkbyexamples.hive;

import java.sql.Connection;
import java.sql.Statement;
import java.sql.DriverManager;

public class HiveJDBCConnect {
	public static void main(String[] args) {
		Connection con = null;
		try {
			String conStr = "jdbc:hive2://192.168.1.148:10000/default";
			Class.forName("org.apache.hive.jdbc.HiveDriver");
			con = DriverManager.getConnection(conStr, "", "");
			Statement stmt = con.createStatement();
			stmt.executeQuery("show databases");
			System.out.println("show database successfully.");
		} catch (Exception ex) {
			ex.printStackTrace();
		} finally {
			try {
				if (con != null)
					con.close();
			} catch (Exception ex) {
			}
		}
	}
}

At a high level, the above example does the following.

  • Class.forName() loads the Hive driver org.apache.hive.jdbc.HiveDriver, which is present in the hive-jdbc library.
  • DriverManager.getConnection() takes the JDBC connection string jdbc:hive2://192.168.1.148:10000/default and returns a Connection object.
  • con.createStatement() returns a Statement object from the connection.
  • stmt.executeQuery("query") executes the query.
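The steps above run the query but do not read its output. As a sketch of one way to print the database names, the variant below iterates the ResultSet returned by executeQuery() and uses try-with-resources so the connection, statement, and result set are closed automatically. The class name HiveShowDatabases and the helper firstColumn() are hypothetical. The sketch assumes that show databases returns the database name in the first column and that the same host, port, and empty credentials apply.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

final class HiveShowDatabases {
    private HiveShowDatabases() {}

    // Collects the first column of every remaining row as a string.
    static List<String> firstColumn(ResultSet rs) throws SQLException {
        List<String> values = new ArrayList<>();
        while (rs.next()) {
            values.add(rs.getString(1));
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String conStr = "jdbc:hive2://192.168.1.148:10000/default";
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Resources are closed in reverse order when the block exits.
        try (Connection con = DriverManager.getConnection(conStr, "", "");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("show databases")) {
            for (String name : firstColumn(rs)) {
                System.out.println(name);
            }
        }
    }
}
```

Running main() requires the hive-jdbc dependency on the classpath and a reachable HiveServer2.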

Accessing Hive from Scala

Below is a complete example of accessing Hive from Scala using the JDBC URL string and driver.


package com.sparkbyexamples.hive

import java.sql.Connection
import java.sql.Statement
import java.sql.DriverManager

object HiveJDBCConnect extends App {
	// Declare the type explicitly; a bare null would be inferred as Null
	var con: Connection = null
	try {
		val conStr = "jdbc:hive2://192.168.1.148:10000/default"
		// Load the HiveServer2 JDBC driver from the hive-jdbc library
		Class.forName("org.apache.hive.jdbc.HiveDriver")
		con = DriverManager.getConnection(conStr, "", "")
		val stmt: Statement = con.createStatement()
		stmt.executeQuery("show databases")
		println("show database successfully")
	} catch {
		case ex: Exception => ex.printStackTrace()
	} finally {
		try {
			if (con != null)
				con.close()
		} catch {
			case ex: Exception => // ignore failures while closing
		}
	}
}

Conclusion

Here you have learned that by starting HiveServer2 you can connect to Hive from remote services using a JDBC connection URL string, and how to connect to Hive from the Java and Scala languages.

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them. Follow Naveen @ LinkedIn and Medium

This Post Has One Comment

  1. PAXI

    example Python and Pyspark pls