How to replace NULL values with Default in Hive

| *** Please Subscribe for Ad Free & Premium Content ***

Post author:Naveen Nelamali
Post category:Apache Hive
Post last modified:April 26, 2023
Reading time:4 mins read

You are currently viewing How to replace NULL values with Default in Hive

Use nvl() function in Hive to replace all NULL values of a column with a default value, In this article, I will explain with an example.

Advertisements

You can use this function to

Replace all NULL values with -1 or 0 or any number for the integer column.
Replace all NULL values with empty space for string types.
Replace with any value based on your need.

Before we just into examples, first, let’s create a table and load some records into it.


CREATE TABLE IF NOT EXISTS employee (
 id int,
 name string,
 age int,
 gender string,
 salary decimal
)
COMMENT 'Employee Table'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

And the table is loaded with the following records where it has NULL values on few columns.


jdbc:hive2://> SELECT * FROM employee;
+--------------+----------------+---------------+------------------+------------------+
| employee.id  | employee.name  | employee.age  | employee.gender  | employee.salary  |
+--------------+----------------+---------------+------------------+------------------+
| 1            | James          | 30            | M                | 10000            |
| 2            | Ann            | 40            | F                | 20000            |
| 3            | Jeff           | 41            | M                | 30000            |
| 4            | Jennifer       | 20            | F                | 40000            |
| 5            | Kum            | 30            | M                | NULL             |
| 7            | Jeni           | NULL          |                  | NULL             |
+--------------+----------------+---------------+------------------+------------------+
6 rows selected (0.149 seconds)

Hive Replace Null Values

Hive nvl() function takes two argument, first being a column name where you wanted to replace NULL values and the second being the default value you wanted to replace with.


0: jdbc:hive2://> select id,name,nvl(age,-1) as age,nvl(salary,-1) as salary from employee;
OK
+-----+-----------+------+---------+
| id  |   name    | age  | salary  |
+-----+-----------+------+---------+
| 1   | James     | 30   | 10000   |
| 2   | Ann       | 40   | 20000   |
| 3   | Jeff      | 41   | 30000   |
| 4   | Jennifer  | 20   | 40000   |
| 5   | Kum       | 30   | -1      |
| 7   | Jeni      | -1   | -1      |
+-----+-----------+------+---------+
6 rows selected (0.212 seconds)
0: jdbc:hive2://>

As you see it replaced the NULL values on the salary and age columns. You can also replace strings with an empty value.

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, He has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen journey in the field of data engineering has been a continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with the data as he come across. Follow Naveen @ LinkedIn and Medium

Hive Replace Null Values

Related Articles

Naveen Nelamali

Leave a Reply