Spark Get Day of Week & Week of Month

Problem: How do you get the day of the week and the week of the month from a Spark DataFrame Date or Timestamp column?

Solution: Using the Spark SQL date_format() function, we can extract any date & time unit from Date and Timestamp columns. Let's see some examples in Scala.

Refer to Spark SQL Date and Timestamp Functions for all Date & Time functions.

We can extract a unit of a date by passing a string formatting pattern to the Spark date function date_format(). On Spark 2.x this function supports the date patterns described in Java's SimpleDateFormat; Spark 3.x switched to a DateTimeFormatter-based pattern set that no longer accepts every legacy letter.
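Because date_format() on Spark 2.x follows SimpleDateFormat-style patterns, you can preview what a given pattern letter returns directly on the JVM, without starting a Spark session. The sketch below is an illustration of the pattern letters using plain java.text.SimpleDateFormat, not of Spark itself (and it assumes the legacy Spark 2.x pattern behavior):

```scala
import java.text.SimpleDateFormat
import java.util.Locale

// Parse a sample string shaped like our DataFrame column values.
val parser = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US)
val d = parser.parse("2019-07-01 12:01:19")

// "u" -> day-of-week number (1 = Monday ... 7 = Sunday)
val dayNumber = new SimpleDateFormat("u", Locale.US).format(d)
// "E" -> abbreviated day name, e.g. "Mon"
val dayAbb = new SimpleDateFormat("E", Locale.US).format(d)

println(s"$dayNumber $dayAbb") // 2019-07-01 is a Monday -> "1 Mon"
```

Locale.US is passed explicitly so the day names don't vary with the JVM's default locale.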

Before we start with examples, first let’s create a DataFrame to work with.


import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
val spark: SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkByExamples.com")
    .getOrCreate()
spark.sparkContext.setLogLevel("ERROR")

import spark.implicits._

val data = Seq(("2019-07-01 12:01:19.000"),
    ("2019-06-24 12:01:19.000"),
    ("2019-11-16 16:44:55.406"),
    ("2019-11-16 16:50:59.406"))
val df = data.toDF("input_timestamp")

Get Day of the week from Date and Timestamp columns

This example uses date formatting patterns to extract the day of the week from Spark date and timestamp DataFrame columns; the value is between 1 and 7, where 1 is Monday and 7 is Sunday.

u – date formatting pattern used to get the day of the week as a number; for example, 1 for Monday and 7 for Sunday.

E – date formatting pattern used to get the day of the week abbreviated to 3 characters; for example, 'Mon' for Monday.


import org.apache.spark.sql.functions._
df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_number", date_format(col("input_timestamp"), "u"))
    .withColumn("week_day_abb", date_format(col("input_timestamp"), "E"))
    .show(false)

This yields the below output.


+-----------------------+---------------+------------+
|input_timestamp        |week_day_number|week_day_abb|
+-----------------------+---------------+------------+
|2019-07-01 12:01:19    |1              |Mon         |
|2019-06-24 12:01:19    |1              |Mon         |
|2019-11-16 16:44:55.406|6              |Sat         |
|2019-11-16 16:50:59.406|6              |Sat         |
+-----------------------+---------------+------------+

Get Week of the Month from Date and Timestamp columns

This example extracts the week of the month from Spark Date and Timestamp DataFrame columns; the possible return values are 0 to 5.

W – date formatting pattern used to get the week number of the month. Note that the default datetime parser in Spark 3.x rejects this pattern; set spark.sql.legacy.timeParserPolicy to LEGACY to keep using it.

EEEE – date formatting pattern used to get the full name of the day of the week; for example, Monday.


import org.apache.spark.sql.functions._
df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_full", date_format(col("input_timestamp"), "EEEE"))
    .withColumn("week_of_month", date_format(col("input_timestamp"), "W"))
    .show(false)

This yields the below output.


+-----------------------+-------------+-------------+
|input_timestamp        |week_day_full|week_of_month|
+-----------------------+-------------+-------------+
|2019-07-01 12:01:19    |Monday       |1            |
|2019-06-24 12:01:19    |Monday       |5            |
|2019-11-16 16:44:55.406|Saturday     |3            |
|2019-11-16 16:50:59.406|Saturday     |3            |
+-----------------------+-------------+-------------+
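The value W returns depends on which day the calendar treats as the first day of the week; the table above matches a Sunday-first calendar such as Locale.US. You can verify this on the JVM with plain SimpleDateFormat (a sketch of the legacy pattern behavior, not of Spark itself):

```scala
import java.text.SimpleDateFormat
import java.util.Locale

val parser = new SimpleDateFormat("yyyy-MM-dd", Locale.US)

// Week-of-month under a Sunday-first calendar (Locale.US):
// June 2019 starts on a Saturday, so 2019-06-24 falls in week 5,
// and November 2019 starts on a Friday, so 2019-11-16 falls in week 3.
val weekOfMonth = new SimpleDateFormat("W", Locale.US)
println(weekOfMonth.format(parser.parse("2019-06-24"))) // 5
println(weekOfMonth.format(parser.parse("2019-11-16"))) // 3
```

A JVM whose default locale starts the week on Monday can produce different week numbers for the same dates, which is why the locale is pinned here.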

This complete example is available for download from GitHub.

Complete Example of Day of Week & Week of the Month


package com.sparkbyexamples.spark.dataframe.functions.datetime

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_timestamp,date_format}

object DayWeekAndWeekMonth extends App {

  val spark: SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkByExamples.com")
    .getOrCreate()
  spark.sparkContext.setLogLevel("ERROR")

  import spark.implicits._

  val df = Seq(("2019-07-01 12:01:19.000"),
    ("2019-06-24 12:01:19.000"),
    ("2019-11-16 16:44:55.406"),
    ("2019-11-16 16:50:59.406")).toDF("input_timestamp")

  df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_number", date_format(col("input_timestamp"), "u"))
    .withColumn("week_day_abb", date_format(col("input_timestamp"), "E"))
    .show(false)

  df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_full", date_format(col("input_timestamp"), "EEEE"))
    .withColumn("week_of_month", date_format(col("input_timestamp"), "W"))
    .show(false)
}

Happy Learning !!

Naveen (NNK)

