Spark – Get a day of week & week of the month

  • Post author:

Problem: How to get a day of the week and week of the month from the Spark DataFrame Date and Timestamp column?

Solution: Using Spark SQL date_format() function, we can get any date & time unit from a Date & Timestamp columns, let’s see with some examples in Scala.

Refer to Spark SQL Date and Timestamp Functions for all Date & Time functions.

We can get the unit of the date by using the String formatting pattern using the Spark date function date_format(). This function supports all Date patterns described in Java DateTimeFormatter. for example

Before we start with examples, first let’s create a DataFrame to work with.


import org.apache.spark.sql.functions._
val spark: SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkByExamples.com")
    .getOrCreate()
spark.sparkContext.setLogLevel("ERROR")

import spark.sqlContext.implicits._

val data = Seq(("2019-07-01 12:01:19.000"),
    ("2019-06-24 12:01:19.000"),
    ("2019-11-16 16:44:55.406"),
    ("2019-11-16 16:50:59.406"))
val df=data.toDF("input_timestamp")

Get Day of the week from Date and Timestamp columns

This example uses Date formatting patterns to extracts Day of the week from the Spark date and timestamp DataFrame columns and the value would be between 1 to 7 where 1 is for Monday and 7 is for Sunday.

u – date formatting pattern is used to get the day of the week in a number. for example 1 for Monday and 7 for Sunday

E – date formatting pattern is used to get the day of the week in 3 characters for example ‘Mon’ for Monday.


import org.apache.spark.sql.functions._
df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_number", date_format(col("input_timestamp"), "u"))
    .withColumn("week_day_abb", date_format(col("input_timestamp"), "E"))
    .show(false)

Yields below output


+-----------------------+---------------+------------+
|input_timestamp        |week_day_number|week_day_abb|
+-----------------------+---------------+------------+
|2019-07-01 12:01:19    |1              |Mon         |
|2019-06-24 12:01:19    |1              |Mon         |
|2019-11-16 16:44:55.406|6              |Sat         |
|2019-11-16 16:50:59.406|6              |Sat         |
+-----------------------+---------------+------------+

Get Week of the Month from Date and Timestamp columns

This example extracts the week of the month from Spark Date and Timestamp DataFrame columns, possible return values would be 0 to 5.

W – date formatting pattern is used to get the week number of the month

EEEE – date formatting pattern is used to get the full week name for example Monday.


import org.apache.spark.sql.functions._
df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_full", date_format(col("input_timestamp"), "EEEE"))
    .withColumn("week_of_month", date_format(col("input_timestamp"), "W"))
    .show(false)

Yields below output.


+-----------------------+-------------+-------------+
|input_timestamp        |week_day_full|week_of_month|
+-----------------------+-------------+-------------+
|2019-07-01 12:01:19    |Monday       |1            |
|2019-06-24 12:01:19    |Monday       |5            |
|2019-11-16 16:44:55.406|Saturday     |3            |
|2019-11-16 16:50:59.406|Saturday     |3            |
+-----------------------+-------------+-------------+

This complete example is available for download from GitHub

Complete Example of Day of Week & Week of the Month


package com.sparkbyexamples.spark.dataframe.functions.datetime

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_timestamp,date_format}

object DayWeekAndWeekMonth extends App {

  val spark: SparkSession = SparkSession.builder()
    .master("local")
    .appName("SparkByExamples.com")
    .getOrCreate()
  spark.sparkContext.setLogLevel("ERROR")

  import spark.sqlContext.implicits._

  val df = Seq(("2019-07-01 12:01:19.000"),
    ("2019-06-24 12:01:19.000"),
    ("2019-11-16 16:44:55.406"),
    ("2019-11-16 16:50:59.406")).toDF("input_timestamp")

  df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_number", date_format(col("input_timestamp"), "u"))
    .withColumn("week_day_abb", date_format(col("input_timestamp"), "E"))
    .show(false)

  df.withColumn("input_timestamp",
    to_timestamp(col("input_timestamp")))
    .withColumn("week_day_full", date_format(col("input_timestamp"), "EEEE"))
    .withColumn("week_of_month", date_format(col("input_timestamp"), "W"))
    .show(false)
}

Happy Learning !!

Spark day of the week

NNK

SparkByExamples.com is a Big Data and Spark examples community page, all examples are simple and easy to understand and well tested in our development environment Read more ..

Leave a Reply

This Post Has 2 Comments

  1. Anonymous

    How will you find a weeknumber from date column? If starting should be from Sunday instead of Monday(default)
    Weekofyear function calculate weeknumber starting from Monday.

  2. Anonymous

    format of “W” doesn’t seem to work with spark 3.0 upgrade, at least not in Databricks.

You are currently viewing Spark – Get a day of week & week of the month