In this article, you will learn how to convert a Unix timestamp in seconds (a long) to a Date, and a Date back to Unix epoch seconds, on a Spark DataFrame column using the SQL function unix_timestamp(), with Scala examples.
NOTE: Unix epoch time in seconds does not hold milliseconds; hence, it is not possible to extract milliseconds from Unix time.
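If you do need millisecond precision, one workaround (a minimal sketch, not part of the original example) is to cast a timestamp column to double, which keeps the fractional seconds that unix_timestamp() drops:

// Sketch: derive epoch milliseconds from a timestamp column.
// Casting TimestampType to double yields epoch seconds with a fractional part.
import org.apache.spark.sql.functions._
import spark.implicits._

val millisDF = Seq(1).toDF("seq").select(
  (current_timestamp().cast("double") * 1000).cast("long").as("epoch_millis")
)
millisDF.show(false)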
First, let's create a DataFrame with current_date(), which returns the current date, and unix_timestamp(), which returns the current Unix timestamp as a long in seconds since 1970-01-01 (the Unix epoch).
import org.apache.spark.sql.functions._
import spark.sqlContext.implicits._

val df = Seq(1).toDF("seq").select(
  current_date().as("current_date"),
  unix_timestamp().as("unix_timestamp_seconds")
)
df.printSchema()
df.show(false)
Yields below output.
// Output:
root
|-- current_date: date (nullable = false)
|-- unix_timestamp_seconds: long (nullable = true)
+------------+----------------------+
|current_date|unix_timestamp_seconds|
+------------+----------------------+
|2019-12-23 |1577146238 |
+------------+----------------------+
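Keep in mind that current_date() and the date/seconds conversions below are evaluated in the Spark session time zone, so your values will differ from the output above. If you want reproducible results across machines, you can pin the session time zone (a one-line sketch; the config key is available in Spark 2.2+):

// Pin the session time zone so date <-> epoch conversions are deterministic
spark.conf.set("spark.sql.session.timeZone", "UTC")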
1. Convert Unix epoch seconds to Date
Once we have a Spark DataFrame with the Unix timestamp in seconds, let's convert the unix_timestamp_seconds column to a timestamp by casting it to TimestampType, and then convert it to a date using the to_date() function.
// Convert unix seconds to date
import org.apache.spark.sql.types.TimestampType

df.select(
  to_date(col("unix_timestamp_seconds").cast(TimestampType)).as("current_date")
).show(false)
Yields below output.
// Output:
+------------+
|current_date|
+------------+
|2019-12-23 |
+------------+
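Alternatively, from_unixtime() gets you to the same place: it formats epoch seconds as a 'yyyy-MM-dd HH:mm:ss' string, which to_date() then truncates to a date. A minimal sketch on the same DataFrame:

// Alternative: format epoch seconds with from_unixtime(), then take the date part
df.select(
  to_date(from_unixtime(col("unix_timestamp_seconds"))).as("current_date")
).show(false)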
2. Convert Date to Unix epoch seconds
In this section, let's convert the Date column to Unix epoch seconds using the unix_timestamp() function, which takes a Date column as an argument and returns the number of seconds since the epoch.
// Convert date to unix seconds
df.select(
  unix_timestamp(col("current_date")).as("unix_seconds"),
  unix_timestamp(lit("12-21-2019"), "MM-dd-yyyy").as("unix_seconds2")
).show(false)
Yields below output.
// Output:
+------------+-------------+
|unix_seconds|unix_seconds2|
+------------+-------------+
|1577088000  |1576915200   |
+------------+-------------+
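A common pitfall with the second form: the format pattern letters are case-sensitive (MM is month, mm is minute, dd is day-of-month, DD is day-of-year), so a pattern like mm-DD-yyyy silently parses the wrong fields. A short sketch parsing an illustrative full timestamp string:

// Pattern letters are case-sensitive: yyyy=year, MM=month, dd=day, HH=hour, mm=minute, ss=second
df.select(
  unix_timestamp(lit("2019-12-21 10:30:00"), "yyyy-MM-dd HH:mm:ss").as("unix_seconds3")
).show(false)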
3. Source code for reference
package com.sparkbyexamples.spark.dataframe.functions.datetime

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.TimestampType

object DateInMilli extends App {

  val spark: SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkByExamples.com")
    .getOrCreate()
  spark.sparkContext.setLogLevel("ERROR")

  import spark.sqlContext.implicits._

  // Create a DataFrame with the current date and current epoch seconds
  val df = Seq(1).toDF("seq").select(
    current_date().as("current_date"),
    unix_timestamp().as("unix_timestamp_seconds")
  )
  df.printSchema()
  df.show(false)

  // Convert seconds to date
  df.select(
    to_date(col("unix_timestamp_seconds").cast(TimestampType)).as("current_date")
  ).show(false)

  // Convert date to seconds
  df.select(
    unix_timestamp(col("current_date")).as("unix_seconds"),
    unix_timestamp(lit("12-21-2019"), "MM-dd-yyyy").as("unix_seconds2")
  ).show(false)
}
The complete code is available at the GitHub project for reference.
Conclusion
In this article, you have learned how to convert a Date to Unix epoch seconds using the unix_timestamp() function, and Unix epoch seconds to a Date using a cast on the DataFrame column, with Scala examples.
Thanks for the article!
Please fix a mistake: unix_timestamp().as(“milliseconds”) -> unix_timestamp().as(“seconds”)
Function unix_timestamp returns seconds since 1970
thanks, Alexander for correction. My bad. I have fixed it now.