In PySpark, use the date_format() function to convert a DataFrame column from Date to String format. This tutorial shows how to convert Date to String format with date_format(), both on a DataFrame and in a Spark SQL query.
date_format() – formats a Date or Timestamp as a String according to the pattern you pass. The pattern letters follow Java's DateTimeFormatter; Spark 3.0 and later list the supported letters in the Datetime Patterns documentation.
The syntax and an example of the date_format() function:
Syntax: date_format(column,format)
Example: date_format(current_timestamp(),"yyyy MM dd").alias("date_format")
The code snippet below takes the current system date from current_date() and the current timestamp from current_timestamp(), and converts them to String format on a DataFrame.
from pyspark.sql.functions import current_date, current_timestamp, date_format

# A one-row DataFrame is enough to evaluate the date functions
df = spark.createDataFrame([["1"]], ["id"])
df.select(
    current_date().alias("current_date"),
    date_format(current_timestamp(), "yyyy MM dd").alias("yyyy MM dd"),
    # hh is the 12-hour clock; use HH for the 24-hour clock
    date_format(current_timestamp(), "MM/dd/yyyy hh:mm").alias("MM/dd/yyyy hh:mm"),
    date_format(current_timestamp(), "yyyy MMM dd").alias("yyyy MMM dd"),
    date_format(current_timestamp(), "yyyy MMMM dd E").alias("yyyy MMMM dd E"),
).show()
Output:
+------------+----------+----------------+-----------+--------------------+
|current_date|yyyy MM dd|MM/dd/yyyy hh:mm|yyyy MMM dd|      yyyy MMMM dd E|
+------------+----------+----------------+-----------+--------------------+
|  2021-02-23|2021 02 23|02/23/2021 02:18|2021 Feb 23|2021 February 23 Tue|
+------------+----------+----------------+-----------+--------------------+
Alternatively, you can convert Date to String in SQL by using the same functions.
# SQL: the same conversions expressed as a query
spark.sql("""
    select current_date() as current_date,
           date_format(current_timestamp(), 'yyyy MM dd') as yyyy_MM_dd,
           date_format(current_timestamp(), 'MM/dd/yyyy hh:mm') as MM_dd_yyyy_hh_mm,
           date_format(current_timestamp(), 'yyyy MMM dd') as yyyy_MMM_dd,
           date_format(current_timestamp(), 'yyyy MMMM dd E') as yyyy_MMMM_dd_E
""").show()
Complete Example of Converting Date to String
from pyspark.sql import SparkSession
from pyspark.sql.functions import current_date, current_timestamp, date_format

# Create SparkSession
spark = SparkSession.builder \
    .appName('SparkByExamples.com') \
    .getOrCreate()

# A one-row DataFrame is enough to evaluate the date functions
df = spark.createDataFrame([["1"]], ["id"])
df.select(
    current_date().alias("current_date"),
    date_format(current_date(), "yyyy MM dd").alias("yyyy MM dd"),
    # hh is the 12-hour clock; use HH for the 24-hour clock
    date_format(current_timestamp(), "MM/dd/yyyy hh:mm").alias("MM/dd/yyyy hh:mm"),
    date_format(current_timestamp(), "yyyy MMM dd").alias("yyyy MMM dd"),
    date_format(current_timestamp(), "yyyy MMMM dd E").alias("yyyy MMMM dd E"),
).show()

# SQL: the same conversions expressed as a query
spark.sql("""
    select current_date() as current_date,
           date_format(current_timestamp(), 'yyyy MM dd') as yyyy_MM_dd,
           date_format(current_timestamp(), 'MM/dd/yyyy hh:mm') as MM_dd_yyyy_hh_mm,
           date_format(current_timestamp(), 'yyyy MMM dd') as yyyy_MMM_dd,
           date_format(current_timestamp(), 'yyyy MMMM dd E') as yyyy_MMMM_dd_E
""").show()
Conclusion:
In this article, you have learned how to convert Date to String format using the date_format() function.