Site icon Spark By {Examples}

PySpark – How to Get Current Date & Timestamp

pyspark current date and timestamp

PySpark SQL provides current_date() and current_timestamp() functions which return the system current date (without timestamp) and the current timestamp respectively, Let’s see how to get these with examples.

current_date() – function return current system date without time in PySpark DateType which is in format yyyy-MM-dd.

current_timestamp() – function returns current system date & timestamp in PySpark TimestampType which is in format yyyy-MM-dd HH:mm:ss.SSS

Note that I’ve used PySpark wihtColumn() to add new columns to the DataFrame


from pyspark.sql import SparkSession

# Create SparkSession
spark = SparkSession.builder \
               .appName('SparkByExamples.com') \
               .getOrCreate()
data=[["1"]]
df=spark.createDataFrame(data,["id"])

from pyspark.sql.functions import *
#current_date() & current_timestamp()
df.withColumn("current_date",current_date()) \
  .withColumn("current_timestamp",current_timestamp()) \
  .show(truncate=False)

Yields below output


+---+------------+-----------------------+
|id |current_date|current_timestamp      |
+---+------------+-----------------------+
|1  |2021-02-23  |2021-02-23 08:19:18.036|
+---+------------+-----------------------+

PySpark SQL- Get Current Date & Timestamp

If you are using SQL, you can also get current Date and Timestamp using.


spark.sql("select current_date(), current_timestamp()")
     .show(truncate=False)

Now see how to format the current date & timestamp into a custom format using date patterns. PySpark supports all patterns supports on Java DateTimeFormatter.

This example converts the date to MM-dd-yyyy using date_format() function and timestamp to MM-dd-yyyy HH mm ss SSS using to_timestamp().


df.withColumn("date_format",date_format(current_date(),"MM-dd-yyyy")) \
  .withColumn("to_timestamp",to_timestamp(current_timestamp(),"MM-dd-yyyy HH mm ss SSS")) \
  .show(truncate=False)

#SQL
spark.sql("select date_format(current_date(),'MM-dd-yyyy') as date_format ," + \
          "to_timestamp(current_timestamp(),'MM-dd-yyyy HH mm ss SSS') as to_timestamp") \
     .show(truncate=False)

For DataFrame example, it yields below output.


+---+-----------+-----------------------+
|id |date_format|to_timestamp           |
+---+-----------+-----------------------+
|1  |02-23-2021 |2021-02-23 08:26:22.794|
+---+-----------+-----------------------+

Complete Example for reference


from pyspark.sql import SparkSession

# Create SparkSession
spark = SparkSession.builder \
               .appName('SparkByExamples.com') \
               .getOrCreate()
data=[["1"]]
df=spark.createDataFrame(data,["id"])

from pyspark.sql.functions import *

#current_date() & current_timestamp()
df.withColumn("current_date",current_date()) \
  .withColumn("current_timestamp",current_timestamp()) \
  .show(truncate=False)

#SQL
spark.sql("select current_date(), current_timestamp()") \
     .show(truncate=False)

# Date & Timestamp into custom format
df.withColumn("date_format",date_format(current_date(),"MM-dd-yyyy")) \
  .withColumn("to_timestamp",to_timestamp(current_timestamp(),"MM-dd-yyyy HH mm ss SSS")) \
  .show(truncate=False)

#SQL
spark.sql("select date_format(current_date(),'MM-dd-yyyy') as date_format ," + \
          "to_timestamp(current_timestamp(),'MM-dd-yyyy HH mm ss SSS') as to_timestamp") \
     .show(truncate=False)

The complete code can be downloaded from GitHub project

Happy Learning !!

Exit mobile version