Why Hive Table is loading with NULL values?

  • Post author:

Problem: I created a new managed table in Hive and tried loading the data to the table, post LOAD command, select shows all NULL values as the data got loaded NULL for all rows and columns.

I’ve created a Hive table as shown below.


jdbc:hive2://127.0.0.1:10000> CREATE TABLE IF NOT EXISTS emp.employee (
 . . . . . . . . . . . . . .> id int,
 . . . . . . . . . . . . . .> name string,
 . . . . . . . . . . . . . .> age int,
 . . . . . . . . . . . . . .> gender string )
 . . . . . . . . . . . . . .> COMMENT 'Employee Table';

I’ve a data file with comma separated fields, upload the data file (data.txt) to HDFS and loaded the file using below command.


LOAD DATA INPATH '/user/hive/data/data.txt' INTO table emp.employee;

selecting the table returns the following.


+--------------+----------------+---------------+------------------+
| employee.id  | employee.name  | employee.age  | employee.gender  |
+--------------+----------------+---------------+------------------+
| NULL         | NULL           | NULL          | NULL             |
| NULL         | NULL           | NULL          | NULL             |
| NULL         | NULL           | NULL          | NULL             |
+--------------+----------------+---------------+------------------+

How to resolve Loading NULL values into Hive Table?

You are getting NULL values loaded to the Hive table because your data is in comma-separated whereas Hive default separator is ^A hence Hive cannot recognizes your columns and loaded them as NULL values.

Since the default field/column terminator in Hive is ^, you need to explicitly mention your custom terminator explicitly using ROW FORMAT DELIMITED FIELDS TERMINATED BY while creating your table.

For example


0: jdbc:hive2://127.0.0.1:10000> CREATE TABLE IF NOT EXISTS emp.employee (
. . . . . . . . . . . . . . . .> id int,
. . . . . . . . . . . . . . . .> name string,
. . . . . . . . . . . . . . . .> age int,
. . . . . . . . . . . . . . . .> gender string)
. . . . . . . . . . . . . . . .> ROW FORMAT DELIMITED
. . . . . . . . . . . . . . . .> FIELDS TERMINATED BY ',';

Now load the same file into the table


LOAD DATA INPATH '/user/hive/data/data.txt' INTO table emp.employee;

Now, the table gets loaded with real values. SELECT * FROM emp.employee returns below output.


+--------------+----------------+---------------+------------------+
| employee.id  | employee.name  | employee.age  | employee.gender  |
+--------------+----------------+---------------+------------------+
| 1            | James          | 30            | M                |
| 2            | Ann            | 40            | F                |
| 3            | Jeff           | 41            | M                |
| 4            | Jennifer       | 20            | F                |
+--------------+----------------+---------------+------------------+

Hope this resolves loading NULL values to Hive table.

Happy Learning !!

NNK

SparkByExamples.com is a Big Data and Spark examples community page, all examples are simple and easy to understand and well tested in our development environment Read more ..

Leave a Reply