• Post author:
  • Post category:Apache Hive
  • Post last modified:January 5, 2023
  • Reading time:8 mins read
You are currently viewing Hive Temporary Table Usage And How to Create?

Using CREATE TEMPORARY TABLE statement we can create a temporary table in Hive which is used to store the data temporarily within an active session and the temporary tables get automatically removed when the active session end.

In this article, I will explain what is a temporary table, how to create one, and what are the differences between regular vs temporary table and their limitations.

Table of Contents

1. What is Hive Temporary Tables?

Hive temporary tables are similar to temporary tables that exist in SQL Server or any RDBMS databases, As the name suggests these tables are created temporarily within an active session.

Usually, temporary tables are created at the run time to store the intermediate data that are used to perform further data processing. once the processing is done either you can explicitly drop the temporary table or session termination will drop these tables.

Temporary tables don’t store data in the Hive warehouse directory instead the data get stored in the user’s scratch directory /tmp/hive/<user>/* on HDFS.

If you create a temporary table in Hive with the same name as a permanent table that already exists in the database, then within that session any references to that permanent table will resolve to the temporary table, rather than to the permanent table.

2. Hive Temporary Table vs Regular Table

Below are the differences between Hive Temporary table vs Regular permanent table

Temporary TableRegular Table (Internal/External)
Creates a table within a sessionCreates globally
Can be accessed only from a session it createdTable could be accessed from different session right after created
Stores at users scratch directory /tmp/hive/<user>/*Stores at Hive warehouse directory /user/hive/warehouse
Automatically removed when session terminatedPersist until explicitly dropped
Doesn’t support partitionsSupports partitions
Indexes cannot be createdYou can create Indexes

3. Hive Create a Temporary Table

By using CREATE TEMPORARY TABLE you can create a temporary table in Hive, The TEMPORARY a clause is optional, not using it will not return an error instead it creates a regular managed table.

3.1 Hive Create Temporary Table Examples

3.1.1 Below is a simple example of creating a temporary table.


CREATE TEMPORARY TABLE emp.employee_tmp (
 id int,
 name string,
 age int,
 gender string)
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY ',';

DESCRIBE emp.employee_tmp returns the following.

hive create temporary table

3.1.2 Create a temporary table using LIKE existing table without copying data.


CREATE TEMPORARY TABLE emp.similar_tmp LIKE emp.employee;

3.1.3 Creating a temporary table from the results of the select query.


CREATE TEMPORARY TABLE emp.filter_tmp AS SELECT id,name FROM emp.employee WHERE gender = 'F';

3.1.4 Creating temporary external table


 CREATE TEMPORARY external TABLE emp.employee_tmp2(id int);

3.2 Loading Files to a Temporary Table


LOAD DATA INPATH '/user/hive/data/data.txt' INTO TABLE emp.employee_tmp;

3.3 Insert Data into Temporary Table

Like SQL, you can also use INSERT INTO to insert rows into table.


INSERT INTO emp.employee VALUES (7,'scott',23,'M');
INSERT INTO emp.employee VALUES (8,'raman',50,'M');

4. Hive Drop Temporary Table

Use DROP TABLE statement to drop a temporary table. Below is an example of how to drop a temporary table.


DROP TABLE IF NOT EXISTS emp.employee_temp

5. Hive Temporary Table Limitations

As mentioned in the differences, Hive temporary table have few limitation compared with regular tables.

  • On temporary tables, you cannot create partitions.
  • Indexes are not supported on temporary tables
  • If you create a temporary table name same as the permanent table name, you cannot access the permanent table until you drop a temporary table or rename it to a different name.

Related Articles

Reference

Hope you like it.

Happy Learning !!

Prabha

Prabha is an accomplished data engineer with a wealth of experience in architecting, developing, and optimizing data pipelines and infrastructure. With a strong foundation in software engineering and a deep understanding of data systems, Prabha excels in building scalable solutions that handle diverse and large datasets efficiently. At SparkbyExamples.com Prabha writes her experience in Spark, PySpark, Python and Pandas.

Leave a Reply