• Post author:
  • Post category:Apache Hive
  • Post last modified:March 2, 2024
  • Reading time:8 mins read
You are currently viewing Hive – Create Database Examples

Using CREATE DATABASE statement you can create a new Database in Hive, like any other RDBMS Databases, the Hive database is a namespace to store the tables.

Advertisements

In this article, I will explain how to create a database, its syntax, and usage with examples in hive shell, Java and Scala languages.

Hive by default contains a default database, you can get this using SHOW DATABASES; statement.

Note: when you running SQL statements on Hive or beeline shell terminal, the statement should end with a semicolon.

Hive create database

Hive Create Database Syntax

In Hive, CREATE DATABASE statement is used to create a Database, this takes an optional clause IF NOT EXISTS, using this option, it creates only when database not already exists.


CREATE DATABASE [IF NOT EXISTS] <database_name>

Note: Creating a database with already existing name in a database returns an error.

Create Database from Hive Beeline shell


jdbc:hive2://> CREATE DATABASE emp;
OK
Time taken: 0.929 seconds
jdbc:hive2://>

Make sure the database you are creating doesn’t exist on Hive warehouse, if exists it throws Database emp already exists error.


jdbc:hive2://> CREATE DATABASE emp;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Database emp already exists

If you wanted to check if the database already exists before creating, use IF NOT EXISTS clause.


jdbc:hive2://> CREATE DATABASE IF NOT EXISTS emp;
OK
Time taken: 0.063 seconds
jdbc:hive2://>

Hive by default creates a directory with a database name .db extension (for example emp.db) under its warehouse directory. By default warehouse directory located is /user/hive/warehouse on hdfs.


prabha@prabha:~$ hdfs dfs -ls /user/hive/warehouse
Found 1 items
drwxr-xr-x   - prabha supergroup          0 2020-10-03 21:30 /user/hive/warehouse/emp.db

You can change the location of the database where to create by using any of the below commands


jdbc:hive2://>CREATE DATABASE temp LOCATION '/apps/project/hive/warehouse';

You can also change the default location using hive.metastore.warehouse.dir


jdbc:hive2://>set hive.metastore.warehouse.dir='/apps/project/hive/warehouse';
jdbc:hive2://>CREATE DATABASE temp;

Show Database

In order to show the existing databases on Hive, use the SHOW DATABASES command.


jdbc:hive2://>> SHOW DATABASES;
OK
default
emp
Time taken: 0.059 seconds, Fetched: 2 row(s)

Creating a Database from Java & Scala

Let’s see how to create a Database from Java & Scala program, In order to connect and run Hive SQL you need to have hive-jdbc dependency and Hive JDBC connection string. You can download dependency from Maven or use the below dependency on your pom.xml.


<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>3.1.2</version>
</dependency>

Below are complete Java and Scala examples of how to create a Database.

Note: If you are using an older version of Hive, you should use the driver org.apache.hadoop.hive.jdbc.HiveDriver and your connection string should be jdbc:hive://

In the below Java and Scala examples, I am connecting to the Hive default database and creating a new Hive database emp.

Java Example


package com.sparkbyexamples.hive;

import java.sql.Connection;
import java.sql.Statement;
import java.sql.DriverManager;

public class HiveCreateDatabase {
	public static void main(String[] args) {
		Connection con = null;
		try {
			String conStr = "jdbc:hive2://192.168.1.148:10000/default";
			Class.forName("org.apache.hive.jdbc.HiveDriver");
			con = DriverManager.getConnection(conStr, "", "");
			Statement stmt = con.createStatement();
			stmt.executeQuery("CREATE DATABASE emp");
			System.out.println("Database emp created successfully.");
		} catch (Exception ex) {
			ex.printStackTrace();
		} finally {
			try {
				if (con != null)
					con.close();
			} catch (Exception ex) {
			}
		}
	}
}

Scala Example


package com.sparkbyexamples.hive;

import java.sql.Connection;
import java.sql.Statement;
import java.sql.DriverManager;

object HiveCreateDatabase extends App{
	var con = null;
	try {
		val conStr = "jdbc:hive2://192.168.1.148:10000/default";
		Class.forName("org.apache.hive.jdbc.HiveDriver");
		con = DriverManager.getConnection(conStr, "", "");
		val stmt = con.createStatement();
		stmt.executeQuery("CREATE DATABASE emp");
		System.out.println("Database emp created successfully.");
	} catch (Exception ex) {
		ex.printStackTrace();
	} finally {
		try {
			if (con != null)
				con.close();
		} catch (Exception ex) {
		}
	}
}

Conclusion

In this article, you have learned creating a Database using hive shell, java, Scala program and also learned it by default stores a database at /user/hive/warehouse directory path and finally learned how to change the default location path.

Happy Learning !!

Prabha

Prabha is an accomplished data engineer with a wealth of experience in architecting, developing, and optimizing data pipelines and infrastructure. With a strong foundation in software engineering and a deep understanding of data systems, Prabha excels in building scalable solutions that handle diverse and large datasets efficiently. At SparkbyExamples.com Prabha writes her experience in Spark, PySpark, Python and Pandas.

Leave a Reply

This Post Has One Comment

  1. Anonymous

    Nice and clear explanation, Thanks for the content. Please post more content in hive like SCD1, SCD2, SCD3 and hive performance tuning techniques so on…