In this tutorial, we will show you a Spark SQL example of how to convert timestamp to date format using to_date()
function on DataFrame with Scala language.
to_date() – function formats Timestamp to Date.
Syntax: to_date(date:Column,format:String):Column
Spark Timestamp consists of value in the format “yyyy-MM-dd HH:mm:ss.SSSS” and date format would be ” yyyy-MM-dd”, Use to_date() function to truncate time from Timestamp or to convert the timestamp to date on Spark DataFrame column.
Using to_date() – Convert Timestamp string to Date
In this example, we will use to_date() function to convert TimestampType column to DateType column. The input to this function should be timestamp column or string in TimestampType format and it returns just date in DateType column.
val df = Seq(("2019-07-01 12:01:19.000"),
("2019-06-24 12:01:19.000"),
("2019-11-16 16:44:55.406"),
("2019-11-16 16:50:59.406")).toDF("input_timestamp")
//Timestamp String to DateType
df.withColumn("datetype",
to_date(col("input_timestamp"),"yyyy-MM-dd"))
.show(false)
Yields below output
+-----------------------+----------+
|input_timestamp |datetype |
+-----------------------+----------+
|2019-07-01 12:01:19.000|2019-07-01|
|2019-06-24 12:01:19.000|2019-06-24|
|2019-11-16 16:44:55.406|2019-11-16|
|2019-11-16 16:50:59.406|2019-11-16|
+-----------------------+----------+
Convert TimestampType (timestamp) to DateType (date)
This example converts the Spark TimestampType column to DateType.
//Timestamp type to DateType
df.withColumn("ts",to_timestamp(col("input_timestamp")))
.withColumn("datetype",to_date(col("ts")))
.show(false)
Yields below output:
+-----------------------+-----------------------+----------+
|input_timestamp |ts |datetype |
+-----------------------+-----------------------+----------+
|2019-07-01 12:01:19.000|2019-07-01 12:01:19 |2019-07-01|
|2019-06-24 12:01:19.000|2019-06-24 12:01:19 |2019-06-24|
|2019-11-16 16:44:55.406|2019-11-16 16:44:55.406|2019-11-16|
|2019-11-16 16:50:59.406|2019-11-16 16:50:59.406|2019-11-16|
+-----------------------+-----------------------+----------+
Using Cast – To convert timestamp
Here is another way to achieve the same using cast
function on a timestamp column. This returns the same output as above.
//Using Cast
df.withColumn("ts",to_timestamp(col("input_timestamp")))
.withColumn("datetype",col("ts").cast(DateType))
.show(false)
Complete code
package com.sparkbyexamples.spark.dataframe.functions.datetime
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_date, to_timestamp}
import org.apache.spark.sql.types.DateType
object TimestampToDate extends App {
val spark:SparkSession = SparkSession.builder()
.master("local")
.appName("SparkByExamples.com")
.getOrCreate()
spark.sparkContext.setLogLevel("ERROR")
import spark.sqlContext.implicits._
val df = Seq(("2019-07-01 12:01:19.000"),
("2019-06-24 12:01:19.000"),
("2019-11-16 16:44:55.406"),
("2019-11-16 16:50:59.406")).toDF("input_timestamp")
//Timestamp String to DateType
df.withColumn("datetype",
to_date(col("input_timestamp"),"yyyy-MM-dd"))
.show(false)
//Timestamp type to DateType
df.withColumn("ts",to_timestamp(col("input_timestamp")))
.withColumn("datetype",to_date(col("ts")))
.show(false)
//Using Cast
df.withColumn("ts",to_timestamp(col("input_timestamp")))
.withColumn("datetype",col("ts").cast(DateType))
.show(false)
}
This complete code is also available at GitHub to download.
In this example, you have learned how to convert the DataFrame timestamp to date column using to_date() and cast functions using a scala example.
Happy Learning !!