Spark SQL provides the current_date() and current_timestamp() functions, which return the current system date without a time component and the current system date with the time, respectively. Let’s see how to use them with Scala examples; the same functions are also available in PySpark.
current_date() – returns the current system date, without the time, as a Spark DateType in the format “yyyy-MM-dd”
current_timestamp() – returns the current system date and time as a Spark TimestampType in the format “yyyy-MM-dd HH:mm:ss” (fractional seconds are displayed when present)
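To confirm the return types described above, one option is to inspect the schema of a DataFrame that contains both columns. The following is a minimal sketch: the object name CurrentDateTypes, the helper build(), and the local[1] master are illustrative choices and not part of the original example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{current_date, current_timestamp}

// Hypothetical helper object, used only for illustration
object CurrentDateTypes {

  // Builds a one-row DataFrame holding the current date and timestamp
  def build(spark: SparkSession): DataFrame =
    spark.range(1).select(
      current_date().as("current_date"),
      current_timestamp().as("current_timestamp"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]") // assumption: a local run for demonstration
      .appName("CurrentDateTypes")
      .getOrCreate()

    // printSchema() lists each column together with its Spark data type
    build(spark).printSchema()

    spark.stop()
  }
}
```

The printed schema should list current_date as a date column and current_timestamp as a timestamp column.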
First, let’s get the current date as a DateType and the current date and time as a TimestampType, and then convert them into a different format. Note that I’ve used withColumn() to add the new columns to the DataFrame.
import org.apache.spark.sql.functions.{current_date, current_timestamp}
import spark.implicits._

// Get the current date and timestamp as new columns
val df = Seq(1).toDF("seq")
val curDate = df.withColumn("current_date", current_date())
  .withColumn("current_timestamp", current_timestamp())
curDate.show(false)
This yields the following output:
// Output:
+---+------------+-----------------------+
|seq|current_date|current_timestamp      |
+---+------------+-----------------------+
|1  |2019-11-16  |2019-11-16 21:00:55.349|
+---+------------+-----------------------+
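One detail worth knowing is that Spark evaluates current_timestamp() once per query, so every row of a single result carries the same value, while a separate action may report a slightly later time. The sketch below illustrates this under a few assumptions: the object name SameTimestampPerQuery, the helper withTimestamp(), and the three-row DataFrame from spark.range(3) are hypothetical and exist only for demonstration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.current_timestamp

// Hypothetical helper object, used only for illustration
object SameTimestampPerQuery {

  // Adds a current_timestamp column to a three-row DataFrame
  def withTimestamp(spark: SparkSession): DataFrame =
    spark.range(3).withColumn("current_timestamp", current_timestamp())

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]") // assumption: a local run for demonstration
      .appName("SameTimestampPerQuery")
      .getOrCreate()

    val df = withTimestamp(spark)
    df.show(false)

    // Count the distinct timestamp values produced within one query
    val distinctCount = df.select("current_timestamp").distinct().count()
    println(s"Distinct timestamps in one query: $distinctCount")

    spark.stop()
  }
}
```

Because the timestamp is fixed for the duration of a query, the distinct count should be 1 regardless of how many rows the DataFrame holds.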
Now let’s split the ‘current_timestamp’ column into separate date and time columns using date_format(), and format the date as ‘MM-dd-yyyy’.
import org.apache.spark.sql.functions.{col, date_format}

// Split the timestamp into formatted date and time strings
curDate.select(
  date_format(col("current_timestamp"), "MM-dd-yyyy").as("date"),
  date_format(col("current_timestamp"), "HH:mm:ss.SSS").as("time"),
  date_format(col("current_date"), "MM-dd-yyyy").as("current_date_formatted"))
  .show(false)
This yields the following output:
// Output:
+----------+------------+----------------------+
|date      |time        |current_date_formatted|
+----------+------------+----------------------+
|11-16-2019|21:00:55.705|11-16-2019            |
+----------+------------+----------------------+
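The pattern letters passed to date_format() can also be tried out without a Spark session. In Spark 3.x the patterns follow conventions close to Java's DateTimeFormatter (an assumption worth checking against the datetime pattern documentation for your Spark version), so this sketch applies the same two patterns to a fixed sample value, 2019-11-16 21:00:55.705. The object name PatternPreview and its format() helper are hypothetical.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Hypothetical helper object, used only for illustration
object PatternPreview {

  // Fixed sample value: 2019-11-16 21:00:55.705 (705 ms expressed in nanoseconds)
  val sample: LocalDateTime = LocalDateTime.of(2019, 11, 16, 21, 0, 55, 705000000)

  // Formats the sample value with the given pattern string
  def format(pattern: String): String =
    sample.format(DateTimeFormatter.ofPattern(pattern))

  def main(args: Array[String]): Unit = {
    println(format("MM-dd-yyyy"))   // prints 11-16-2019
    println(format("HH:mm:ss.SSS")) // prints 21:00:55.705
  }
}
```

This runs on the JVM alone, which makes it a quick way to preview a pattern before using it in a Spark job.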
Complete Example for reference
package com.sparkbyexamples.spark.dataframe.functions.datetime

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object CurrentDateAndTime extends App {

  val spark: SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkByExamples.com")
    .getOrCreate()
  spark.sparkContext.setLogLevel("ERROR")

  import spark.implicits._

  // Get the current date and timestamp as new columns
  val df = Seq(1).toDF("seq")
  val curDate = df.withColumn("current_date", current_date())
    .withColumn("current_timestamp", current_timestamp())
  curDate.show(false)

  // Split the timestamp into formatted date and time strings
  curDate.select(
    date_format(col("current_timestamp"), "MM-dd-yyyy").as("date"),
    date_format(col("current_timestamp"), "HH:mm:ss.SSS").as("time"),
    date_format(col("current_date"), "MM-dd-yyyy").as("current_date_formatted"))
    .show(false)
}
The complete code can be downloaded from the GitHub project.
Happy Learning !!