Spark Get Day of Week & Week of Month

Problem: How do you get the day of the week and the week of the month from Spark DataFrame Date and Timestamp columns?

Solution: Using the Spark SQL date_format() function, we can get any date & time unit from Date & Timestamp columns. Let's see some examples in Scala.

Refer to Spark SQL Date and Timestamp Functions for all Date & Time functions.

We can extract a unit of the date by passing a string formatting pattern to the Spark date function date_format(). This function supports the date patterns described in Java's DateTimeFormatter.

Before we start with examples, first let’s create a DataFrame to work with.


import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark: SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkByExamples.com")
    .getOrCreate()
spark.sparkContext.setLogLevel("ERROR")

import spark.implicits._

val data = Seq(("2019-07-01 12:01:19.000"),
    ("2019-06-24 12:01:19.000"),
    ("2019-11-16 16:44:55.406"),
    ("2019-11-16 16:50:59.406"))
val df = data.toDF("input_timestamp")

Get Day of the week from Date and Timestamp columns

This example uses date formatting patterns to extract the day of the week from Spark date and timestamp DataFrame columns. The value will be between 1 and 7, where 1 is Monday and 7 is Sunday.

u – this date formatting pattern returns the day of the week as a number, for example 1 for Monday and 7 for Sunday.

E – this date formatting pattern returns the day of the week abbreviated to three characters, for example 'Mon' for Monday.


import org.apache.spark.sql.functions._
df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_number", date_format(col("input_timestamp"), "u"))
    .withColumn("week_day_abb", date_format(col("input_timestamp"), "E"))
    .show(false)

Yields the below output.


+-----------------------+---------------+------------+
|input_timestamp        |week_day_number|week_day_abb|
+-----------------------+---------------+------------+
|2019-07-01 12:01:19    |1              |Mon         |
|2019-06-24 12:01:19    |1              |Mon         |
|2019-11-16 16:44:55.406|6              |Sat         |
|2019-11-16 16:50:59.406|6              |Sat         |
+-----------------------+---------------+------------+
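Note that Spark also ships a built-in dayofweek() function (available since Spark 2.3) that returns the day of the week as a number without relying on format patterns, so it keeps working across Spark versions. Unlike the "u" pattern above, it numbers days from 1 for Sunday through 7 for Saturday, which also helps if you need week numbering that starts on Sunday. A minimal sketch, reusing the df created earlier:

```scala
import org.apache.spark.sql.functions.{col, to_timestamp, dayofweek}

// dayofweek() returns 1 = Sunday .. 7 = Saturday, so the Mondays in the
// sample data yield 2 and the Saturdays yield 7.
df.withColumn("input_timestamp", to_timestamp(col("input_timestamp")))
  .withColumn("day_of_week_num", dayofweek(col("input_timestamp")))
  .show(false)
```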

Get Week of the Month from Date and Timestamp columns

This example extracts the week of the month from Spark Date and Timestamp DataFrame columns; the possible return values are 0 to 5.

W – this date formatting pattern returns the week number of the month.

EEEE – this date formatting pattern returns the full day-of-week name, for example 'Monday'.


import org.apache.spark.sql.functions._
df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_full", date_format(col("input_timestamp"), "EEEE"))
    .withColumn("week_of_month", date_format(col("input_timestamp"), "W"))
    .show(false)

Yields below output.


+-----------------------+-------------+-------------+
|input_timestamp        |week_day_full|week_of_month|
+-----------------------+-------------+-------------+
|2019-07-01 12:01:19    |Monday       |1            |
|2019-06-24 12:01:19    |Monday       |5            |
|2019-11-16 16:44:55.406|Saturday     |3            |
|2019-11-16 16:50:59.406|Saturday     |3            |
+-----------------------+-------------+-------------+
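Note: starting with Spark 3.0, date_format() validates patterns against Spark's own DateTimeFormatter-based pattern set, and week-based letters such as "W" (and "u" used above) are rejected by default, so the examples above will throw an error on Spark 3.x. A commonly suggested workaround, sketched below under the assumption that an active spark session exists, is to switch the parser back to the pre-3.0 legacy behavior:

```scala
// Restore pre-3.0 (SimpleDateFormat-based) pattern behavior so that
// week-based patterns like "W" and "u" are accepted again.
// Caution: this is a session-wide setting and affects all datetime
// parsing and formatting, not just date_format().
spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")
```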

This complete example is available for download from GitHub

Complete Example of Day of Week & Week of the Month


package com.sparkbyexamples.spark.dataframe.functions.datetime

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_timestamp, date_format}

object DayWeekAndWeekMonth extends App {

  val spark: SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkByExamples.com")
    .getOrCreate()
  spark.sparkContext.setLogLevel("ERROR")

  import spark.implicits._

  val df = Seq(("2019-07-01 12:01:19.000"),
    ("2019-06-24 12:01:19.000"),
    ("2019-11-16 16:44:55.406"),
    ("2019-11-16 16:50:59.406")).toDF("input_timestamp")

  df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_number", date_format(col("input_timestamp"), "u"))
    .withColumn("week_day_abb", date_format(col("input_timestamp"), "E"))
    .show(false)

  df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_full", date_format(col("input_timestamp"), "EEEE"))
    .withColumn("week_of_month", date_format(col("input_timestamp"), "W"))
    .show(false)
}

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them. Follow Naveen @ LinkedIn and Medium

This Post Has 2 Comments

  1. Anonymous

    How would you find the week number from a date column if the week should start on Sunday instead of Monday (the default)?
    The weekofyear function calculates the week number starting from Monday.

  2. Anonymous

    The “W” format doesn’t seem to work after the Spark 3.0 upgrade, at least not in Databricks.

Comments are closed.