Spark convert Unix timestamp (seconds) to Date

In this article, you will learn how to convert a Unix timestamp (in seconds, stored as a long) to a Date, and a Date back to seconds, on a Spark DataFrame column using the SQL function unix_timestamp(), with Scala examples.

NOTE: Unix epoch time in seconds does not hold milliseconds; hence, it’s not possible to extract milliseconds from it.
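To make the note concrete, here is a minimal plain-Scala sketch (no Spark needed) of what happens when a millisecond value is reduced to epoch seconds. The object name EpochPrecision, the helper millisToSeconds, and the 123 ms in the sample value are illustrative assumptions, not part of the article's code.

```scala
import java.time.Instant

object EpochPrecision {
  // Reduce epoch milliseconds to whole epoch seconds; the millisecond part is discarded.
  def millisToSeconds(epochMillis: Long): Long = Math.floorDiv(epochMillis, 1000L)

  def main(args: Array[String]): Unit = {
    val millis  = 1577146238123L // hypothetical sample: 123 ms past the second
    val seconds = millisToSeconds(millis)

    println(seconds)                                      // 1577146238
    println(Instant.ofEpochSecond(seconds))               // 2019-12-24T00:10:38Z
    // Going back to milliseconds can only give a multiple of 1000.
    println(Instant.ofEpochSecond(seconds).toEpochMilli)  // 1577146238000
  }
}
```

Once a value is stored as seconds, the original 123 ms cannot be recovered, which is why a seconds column cannot yield milliseconds.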

First, let’s create a DataFrame with current_date(), which gives the current date, and unix_timestamp(), which gives the current Unix timestamp as a long: the number of seconds since 1970-01-01 00:00:00 UTC.


  import org.apache.spark.sql.functions._
  import spark.sqlContext.implicits._

  val df = Seq(1).toDF("seq").select(
    current_date().as("current_date"),
    unix_timestamp().as("unix_timestamp_seconds")
  )

  df.printSchema()
  df.show(false)

This yields the following output.


// Output:
root
 |-- current_date: date (nullable = false)
 |-- unix_timestamp_seconds: long (nullable = true)

+------------+----------------------+
|current_date|unix_timestamp_seconds|
+------------+----------------------+
|2019-12-23  |1577146238            |
+------------+----------------------+

1. Convert Unix epoch seconds to Date

Once we have a Spark DataFrame with the Unix timestamp in seconds, let’s convert the unix_timestamp_seconds column to a timestamp by casting the seconds to TimestampType, and then convert it to a date using the to_date() function.


  // Convert unix seconds to date
  import org.apache.spark.sql.types.TimestampType

  df.select(
    to_date(col("unix_timestamp_seconds").cast(TimestampType)).as("current_date")
  ).show(false)

This yields the following output.


// Output:
+------------+
|current_date|
+------------+
|2019-12-23  |
+------------+
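The date that comes out of this conversion depends on the time zone in which the timestamp is interpreted (in Spark, the session time zone). As a rough plain-Scala illustration using java.time rather than Spark, the sketch below converts the same epoch seconds to a date in two zones. The object name EpochToDate, the helper epochSecondsToDate, and the choice of zones are assumptions made for this example.

```scala
import java.time.{Instant, LocalDate, ZoneId}

object EpochToDate {
  // Interpret epoch seconds in the given zone and keep only the calendar date.
  def epochSecondsToDate(epochSeconds: Long, zone: ZoneId): LocalDate =
    Instant.ofEpochSecond(epochSeconds).atZone(zone).toLocalDate

  def main(args: Array[String]): Unit = {
    val seconds = 1577146238L // value from the sample output above

    println(epochSecondsToDate(seconds, ZoneId.of("UTC")))                 // 2019-12-24
    println(epochSecondsToDate(seconds, ZoneId.of("America/Los_Angeles"))) // 2019-12-23
  }
}
```

The same instant falls on different calendar days in the two zones, so the 2019-12-23 shown above is consistent with a session running at UTC-8.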

2. Convert Date to Unix epoch seconds

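In this section, let’s convert a Date column to Unix seconds using the unix_timestamp() function, which takes a Date column as an argument and returns seconds. A second form takes a date string together with its pattern. Pattern letters are case-sensitive: MM is the month and dd is the day of the month, whereas mm means minutes and DD means day of the year.


  // Convert date to unix seconds
  df.select(
    unix_timestamp(col("current_date")).as("unix_seconds"),
    // MM = month, dd = day of month (mm would be minutes, DD day of year)
    unix_timestamp(lit("12-21-2019"), "MM-dd-yyyy").as("unix_seconds2")
  ).show(false)

This yields the following output. The values depend on the Spark session time zone; the ones shown here correspond to midnight at UTC-8, consistent with the earlier output.


// Output:
+------------+-------------+
|unix_seconds|unix_seconds2|
+------------+-------------+
|1577088000  |1576915200   |
+------------+-------------+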
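As a cross-check outside Spark, here is a small plain-Scala sketch that parses a date string with an explicit pattern and converts midnight of that date to epoch seconds. It uses java.time, where the same case-sensitivity applies: MM is the month and dd the day of the month, while mm is minutes and DD is the day of the year. The object name DateToEpoch, the helper dateToEpochSeconds, and the zones are illustrative assumptions.

```scala
import java.time.{LocalDate, ZoneId}
import java.time.format.DateTimeFormatter

object DateToEpoch {
  // Parse a date string with the given pattern and return epoch seconds
  // for the start of that day in the given zone.
  def dateToEpochSeconds(text: String, pattern: String, zone: ZoneId): Long = {
    val formatter = DateTimeFormatter.ofPattern(pattern)
    LocalDate.parse(text, formatter).atStartOfDay(zone).toEpochSecond
  }

  def main(args: Array[String]): Unit = {
    // MM = month, dd = day of month
    println(dateToEpochSeconds("12-21-2019", "MM-dd-yyyy", ZoneId.of("UTC")))                 // 1576886400
    println(dateToEpochSeconds("12-21-2019", "MM-dd-yyyy", ZoneId.of("America/Los_Angeles"))) // 1576915200
  }
}
```

The two results differ by 28800 seconds (8 hours), which is the offset between UTC and Pacific Standard Time on that date.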

3. Source code for reference


package com.sparkbyexamples.spark.dataframe.functions.datetime

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.TimestampType

object DateInMilli extends App{

  val spark:SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkByExamples.com")
    .getOrCreate()
  spark.sparkContext.setLogLevel("ERROR")

  import spark.sqlContext.implicits._

  val df = Seq(1).toDF("seq").select(
    current_date().as("current_date"),
    unix_timestamp().as("unix_timestamp_seconds")
    )

  df.printSchema()
  df.show(false)

  // Convert seconds to date
  df.select(
    to_date(col("unix_timestamp_seconds").cast(TimestampType)).as("current_date")
  ).show(false)

  // Convert date to seconds
  df.select(
    unix_timestamp(col("current_date")).as("unix_seconds"),
    // MM = month, dd = day of month (mm would be minutes, DD day of year)
    unix_timestamp(lit("12-21-2019"), "MM-dd-yyyy").as("unix_seconds2")
  ).show(false)

}

The complete code is available in the GitHub project for reference.

Conclusion

In this article, you have learned how to convert a Date to Unix epoch seconds using the unix_timestamp() function, and Unix epoch seconds to a Date by casting the DataFrame column to TimestampType and applying to_date(), with Scala examples.

Naveen Nelamali

