PySpark SQL provides the to_date() function to convert a String column of a DataFrame to Date format. Note that Spark date functions use format patterns based on Java's DateTimeFormatter.
to_date() – converts a string (StringType) column to a date (DateType) column.
Syntax: to_date(column,format)
Example: to_date(col("string_column"),"MM-dd-yyyy")
The first argument is the column that holds the date string, and the second argument is the pattern that describes how the date is laid out in that string.
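To get a feel for what the pattern letters mean without starting a Spark session, the same parse can be mimicked in plain Python. This is only an analogy: Python's strptime directives are a different syntax from Spark's pattern letters, and the mapping of "MM-dd-yyyy" to "%m-%d-%Y" below, along with the helper name parse_like_to_date, is an assumption made for illustration.

```python
from datetime import datetime

# Spark pattern used in this article, and an assumed strptime equivalent:
#   MM   -> %m  (two-digit month)
#   dd   -> %d  (two-digit day of month)
#   yyyy -> %Y  (four-digit year)
SPARK_PATTERN = "MM-dd-yyyy"
PYTHON_EQUIVALENT = "%m-%d-%Y"


def parse_like_to_date(text):
    """Hypothetical helper: parse a date string laid out as month-day-year."""
    return datetime.strptime(text, PYTHON_EQUIVALENT).date()


parsed = parse_like_to_date("02-03-2013")
print(parsed)  # 2013-02-03
```

The month comes first in the input, so "02-03-2013" is read as February 3, not March 2. The pattern has to match the layout of the input string, not the layout you want in the output.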
The code snippet below takes a String column and converts it to Date format.
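from pyspark.sql.functions import col, to_date

# Two sample date strings in MM-dd-yyyy form
df = spark.createDataFrame([["02-03-2013"], ["05-06-2023"]], ["input"])
# Parse the string column into a DateType column named "date"
df.select(col("input"), to_date(col("input"), "MM-dd-yyyy").alias("date")) \
    .show()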
Output:
+----------+----------+
|     input|      date|
+----------+----------+
|02-03-2013|2013-02-03|
|05-06-2023|2023-05-06|
+----------+----------+
Alternatively, you can convert a String to a Date in SQL by using the same function.
spark.sql("select to_date('02-03-2013','MM-dd-yyyy') date") \
.show()
Complete Example
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date

# Create SparkSession
spark = SparkSession.builder \
    .appName('SparkByExamples.com') \
    .getOrCreate()

# DataFrame API: parse the string column into a DateType column
df = spark.createDataFrame([["02-03-2013"], ["05-06-2023"]], ["input"])
df.select(col("input"), to_date(col("input"), "MM-dd-yyyy").alias("date")) \
    .show()

# SQL
spark.sql("select to_date('02-03-2013','MM-dd-yyyy') date").show()
Conclusion:
In this article, you have learned how to convert a String to Date format using the to_date() function.