Problem: I created a new managed table in Hive and tried loading the data to the table, post LOAD
command, select shows all NULL
values as the data got loaded NULL for all rows and columns.
I’ve created a Hive table as shown below.
jdbc:hive2://127.0.0.1:10000> CREATE TABLE IF NOT EXISTS emp.employee (
. . . . . . . . . . . . . .> id int,
. . . . . . . . . . . . . .> name string,
. . . . . . . . . . . . . .> age int,
. . . . . . . . . . . . . .> gender string )
. . . . . . . . . . . . . .> COMMENT 'Employee Table';
I’ve a data file with comma separated fields, upload the data file (data.txt) to HDFS and loaded the file using below command.
LOAD DATA INPATH '/user/hive/data/data.txt' INTO table emp.employee;
selecting the table returns the following.
+--------------+----------------+---------------+------------------+
| employee.id | employee.name | employee.age | employee.gender |
+--------------+----------------+---------------+------------------+
| NULL | NULL | NULL | NULL |
| NULL | NULL | NULL | NULL |
| NULL | NULL | NULL | NULL |
+--------------+----------------+---------------+------------------+
How to resolve Loading NULL values into Hive Table?
You are getting NULL values loaded to the Hive table because your data is in comma-separated whereas Hive default separator is ^A hence Hive cannot recognizes your columns and loaded them as NULL values.
Since the default field/column terminator in Hive is ^, you need to explicitly mention your custom terminator explicitly using ROW FORMAT DELIMITED FIELDS TERMINATED BY
while creating your table.
For example
0: jdbc:hive2://127.0.0.1:10000> CREATE TABLE IF NOT EXISTS emp.employee (
. . . . . . . . . . . . . . . .> id int,
. . . . . . . . . . . . . . . .> name string,
. . . . . . . . . . . . . . . .> age int,
. . . . . . . . . . . . . . . .> gender string)
. . . . . . . . . . . . . . . .> ROW FORMAT DELIMITED
. . . . . . . . . . . . . . . .> FIELDS TERMINATED BY ',';
Now load the same file into the table
LOAD DATA INPATH '/user/hive/data/data.txt' INTO table emp.employee;
Now, the table gets loaded with real values. SELECT * FROM emp.employee
returns below output.
+--------------+----------------+---------------+------------------+
| employee.id | employee.name | employee.age | employee.gender |
+--------------+----------------+---------------+------------------+
| 1 | James | 30 | M |
| 2 | Ann | 40 | F |
| 3 | Jeff | 41 | M |
| 4 | Jennifer | 20 | F |
+--------------+----------------+---------------+------------------+
Hope this resolves loading NULL values to Hive table.
Happy Learning !!