Spark convert Unix timestamp (seconds) to Date


In this article, you will learn how to convert a Unix timestamp (in seconds, stored as a long) to a Date, and a Date back to seconds, on a Spark DataFrame column using the SQL function unix_timestamp(), with Scala examples.

NOTE: Unix epoch time in seconds does not hold milliseconds; hence, it’s not possible to extract milliseconds from it.
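To make the note concrete, here is a minimal sketch in plain Scala (using java.time rather than Spark; the object name EpochPrecision and its method names are made up for illustration). It shows that reducing an epoch-millisecond value to seconds discards the sub-second part, and that converting back cannot restore it.

```scala
import java.time.Instant

// Hypothetical helper object, for illustration only (not part of Spark).
object EpochPrecision {
  // floorDiv drops the millisecond part and rounds toward negative infinity,
  // so values before 1970 are handled consistently too.
  def millisToSeconds(epochMillis: Long): Long =
    Math.floorDiv(epochMillis, 1000L)

  // Going back to milliseconds can only ever produce a whole second.
  def secondsToMillis(epochSeconds: Long): Long =
    Instant.ofEpochSecond(epochSeconds).toEpochMilli
}
```

For example, EpochPrecision.millisToSeconds(1577146238123L) gives 1577146238, and EpochPrecision.secondsToMillis(1577146238L) gives 1577146238000: the trailing 123 milliseconds are gone for good.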

First, let’s create a DataFrame with current_date(), which gives the current date, and unix_timestamp(), which gives the current Unix timestamp as a long: the number of seconds since 1970-01-01 00:00:00 UTC.


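  // current_date(), unix_timestamp(), col(), lit() and to_date() come from functions._
  import org.apache.spark.sql.functions._
  // Needed for toDF() on a Seq
  import spark.sqlContext.implicits._

  val df = Seq(1).toDF("seq").select(
    current_date().as("current_date"),
    unix_timestamp().as("unix_timestamp_seconds")
  )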

  df.printSchema()
  df.show(false)

Yields below output.


root
 |-- current_date: date (nullable = false)
 |-- unix_timestamp_seconds: long (nullable = true)

+------------+----------------------+
|current_date|unix_timestamp_seconds|
+------------+----------------------+
|2019-12-23  |1577146238            |
+------------+----------------------+

Convert Unix epoch seconds to Date

Once we have a Spark DataFrame with the Unix timestamp in seconds, let’s convert the unix_timestamp_seconds column to a timestamp by casting the seconds to TimestampType, and then convert it to a date using the to_date() function.


  //Convert unix seconds to date
  import org.apache.spark.sql.types.TimestampType

  df.select(
    to_date(col("unix_timestamp_seconds").cast(TimestampType)).as("current_date")
  ).show(false)

Yields below output.


+------------+
|current_date|
+------------+
|2019-12-23  |
+------------+
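
Note that the date you get back depends on the time zone used to interpret the instant; in Spark this is the session time zone (spark.sql.session.timeZone). A small plain-Scala sketch (java.time only; the object name EpochToDate is hypothetical, and the UTC-8 offset is an assumption inferred from the sample output in this article) shows the same epoch second landing on two different calendar dates.

```scala
import java.time.{Instant, LocalDate, ZoneId, ZoneOffset}

// Hypothetical helper object, for illustration only (not part of Spark).
object EpochToDate {
  // Interpret the epoch second in the given zone, then keep only the date part.
  def toLocalDate(epochSeconds: Long, zone: ZoneId): LocalDate =
    Instant.ofEpochSecond(epochSeconds).atZone(zone).toLocalDate
}
```

With the sample value from above, EpochToDate.toLocalDate(1577146238L, ZoneOffset.ofHours(-8)) is 2019-12-23, while the same call with ZoneOffset.UTC is 2019-12-24, because the instant falls shortly after midnight UTC.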

Convert Date to Unix epoch seconds

In this section, let’s convert a Date column to Unix seconds using the unix_timestamp() function, which takes a Date column as an argument and returns seconds. It can also parse a date string when given a matching pattern.


  //convert date to unix seconds
  df.select(
    unix_timestamp(col("current_date")).as("unix_seconds"),
    // MM = month, dd = day of month (lowercase mm would mean minutes, uppercase DD day of year)
    unix_timestamp(lit("12-21-2019"),"MM-dd-yyyy").as("unix_seconds2")
  ).show(false)

Yields below output.


+------------+-------------+
|unix_seconds|unix_seconds2|
+------------+-------------+
|1577088000  |1576915200   |
+------------+-------------+
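
The pattern letters follow Java's date-time conventions, where case matters: MM is month of year and dd is day of month, while mm is minute of hour and DD is day of year. As a sanity check outside Spark, here is a minimal sketch with java.time (the object name DateToEpoch is hypothetical, and the zone is passed explicitly instead of coming from a Spark session). It parses an MM-dd-yyyy string and returns midnight of that date as epoch seconds.

```scala
import java.time.{LocalDate, ZoneId, ZoneOffset}
import java.time.format.DateTimeFormatter

// Hypothetical helper object, for illustration only (not part of Spark).
object DateToEpoch {
  // Uppercase MM = month of year, lowercase dd = day of month.
  private val fmt = DateTimeFormatter.ofPattern("MM-dd-yyyy")

  // Parse the text as a date and return midnight of that date, in the
  // given zone, as seconds since the Unix epoch.
  def toEpochSeconds(text: String, zone: ZoneId): Long =
    LocalDate.parse(text, fmt).atStartOfDay(zone).toEpochSecond
}
```

For example, DateToEpoch.toEpochSeconds("12-21-2019", ZoneOffset.UTC) gives 1576886400, and the same call with ZoneOffset.ofHours(-8) gives 1576915200, which is 8 hours (28800 seconds) later.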

Source code for reference


package com.sparkbyexamples.spark.dataframe.functions.datetime

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.TimestampType

object DateInMilli extends App{

  val spark:SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkByExamples.com")
    .getOrCreate()
  spark.sparkContext.setLogLevel("ERROR")

  import spark.sqlContext.implicits._

  val df = Seq(1).toDF("seq").select(
    current_date().as("current_date"),
    unix_timestamp().as("unix_timestamp_seconds")
    )

  df.printSchema()
  df.show(false)

  //Convert seconds to date
  df.select(
    to_date(col("unix_timestamp_seconds").cast(TimestampType)).as("current_date")
  ).show(false)

  //convert date to seconds
  df.select(
    unix_timestamp(col("current_date")).as("unix_seconds"),
    // MM = month, dd = day of month (lowercase mm would mean minutes, uppercase DD day of year)
    unix_timestamp(lit("12-21-2019"),"MM-dd-yyyy").as("unix_seconds2")
  ).show(false)

}

The complete code is available in the GitHub project for reference.

Conclusion

In this article, you have learned how to convert a Date to Unix epoch seconds using the unix_timestamp() function, and Unix epoch seconds to a Date using a cast on the DataFrame column followed by to_date(), with Scala examples.

Naveen (NNK)



This Post Has 4 Comments

  1. Alexander

    Thanks for the article!
    Please fix a mistake: unix_timestamp().as("milliseconds") -> unix_timestamp().as("seconds")
    Function unix_timestamp returns seconds since 1970

    1. NNK

      thanks, Alexander for correction. My bad. I have fixed it now.
