You are currently viewing How to replace NULL values with Default in Hive

Use nvl() function in Hive to replace all NULL values of a column with a default value, In this article, I will explain with an example.

Advertisements

You can use this function to

  • Replace all NULL values with -1 or 0 or any number for the integer column.
  • Replace all NULL values with empty space for string types.
  • Replace with any value based on your need.

Related: Spark Replace NULL Values with Empty String

Before we just into examples, first, let’s create a table and load some records into it.


CREATE TABLE IF NOT EXISTS employee (
 id int,
 name string,
 age int,
 gender string,
 salary decimal
)
COMMENT 'Employee Table'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

And the table is loaded with the following records where it has NULL values on few columns.


jdbc:hive2://> SELECT * FROM employee;
+--------------+----------------+---------------+------------------+------------------+
| employee.id  | employee.name  | employee.age  | employee.gender  | employee.salary  |
+--------------+----------------+---------------+------------------+------------------+
| 1            | James          | 30            | M                | 10000            |
| 2            | Ann            | 40            | F                | 20000            |
| 3            | Jeff           | 41            | M                | 30000            |
| 4            | Jennifer       | 20            | F                | 40000            |
| 5            | Kum            | 30            | M                | NULL             |
| 7            | Jeni           | NULL          |                  | NULL             |
+--------------+----------------+---------------+------------------+------------------+
6 rows selected (0.149 seconds)

Hive Replace Null Values

Hive nvl() function takes two argument, first being a column name where you wanted to replace NULL values and the second being the default value you wanted to replace with.


0: jdbc:hive2://> select id,name,nvl(age,-1) as age,nvl(salary,-1) as salary from employee;
OK
+-----+-----------+------+---------+
| id  |   name    | age  | salary  |
+-----+-----------+------+---------+
| 1   | James     | 30   | 10000   |
| 2   | Ann       | 40   | 20000   |
| 3   | Jeff      | 41   | 30000   |
| 4   | Jennifer  | 20   | 40000   |
| 5   | Kum       | 30   | -1      |
| 7   | Jeni      | -1   | -1      |
+-----+-----------+------+---------+
6 rows selected (0.212 seconds)
0: jdbc:hive2://>

As you see it replaced the NULL values on the salary and age columns. You can also replace strings with an empty value.

Happy Learning !!

Leave a Reply