Spark – Sort by column in descending order?


In order to sort a Spark DataFrame in descending order, we can use the desc property of the Column class or the desc() SQL function. In this article, I will explain how to sort a DataFrame in descending order on multiple columns using these approaches.
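
The examples below assume an existing SparkSession named spark (as in spark-shell) and a DataFrame df with department, state, employee_name, salary, age, and bonus columns. The snippet below is just one illustrative way to create such a DataFrame; the values are made up.


// Hypothetical sample data for the examples below
import spark.implicits._

val df = Seq(
  ("James", "Sales", "NY", 90000, 34, 10000),
  ("Maria", "Finance", "CA", 85000, 24, 23000),
  ("Robert", "Sales", "CA", 81000, 30, 23000),
  ("Jen", "Finance", "NY", 79000, 53, 15000)
).toDF("employee_name", "department", "state", "salary", "age", "bonus")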

1. Using sort() for descending order

First, note that calling sort() with plain column names sorts in ascending order by default:


// sort() with column names defaults to ascending order
df.sort("department","state")

Now, let’s sort in descending order using the desc property of the Column class. To get a Column object, we use the col() SQL function:


// Descending sort using the desc property of Column
import org.apache.spark.sql.functions.col
df.sort(col("department").desc, col("state").desc)

Finally, let’s use the desc() SQL function by importing org.apache.spark.sql.functions.desc:


// Descending sort using the desc() SQL function
import org.apache.spark.sql.functions.desc
df.sort(desc("department"), desc("state"))
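
Sort directions can also be mixed when columns need different orders; the following illustrative example sorts department ascending and state descending:


// Mixed sort directions: department ascending, state descending
import org.apache.spark.sql.functions.col
df.sort(col("department").asc, col("state").desc)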

2. Using orderBy() for descending order

Alternatively, we can use the orderBy() function of the DataFrame to sort in descending order. All examples explained with sort() also work here.


// orderBy() with column names defaults to ascending order
df.orderBy("department","state")

// Descending sort using the desc property of Column
import org.apache.spark.sql.functions.col
df.orderBy(col("department").desc, col("state").desc)

// Descending sort using the desc() SQL function
import org.apache.spark.sql.functions.desc
df.orderBy(desc("department"), desc("state"))
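
If the sort columns can contain nulls, the Column class also provides desc_nulls_first and desc_nulls_last (Spark 2.1 and later) to control where nulls appear in a descending sort, for example:


// Control where nulls appear in a descending sort
import org.apache.spark.sql.functions.col
df.orderBy(col("department").desc_nulls_last, col("state").desc_nulls_last)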

3. Using SQL to sort

And, we can also use a SQL expression to sort in descending order. Register the DataFrame as a temporary view and use ORDER BY ... DESC:


// Using SQL to sort in descending order
df.createOrReplaceTempView("DEPT")
spark.sql("select employee_name, department, state, salary, age, bonus from DEPT " +
  "order by department desc, state desc")
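
As with the DataFrame API examples, you can display the sorted result with show(), for example:


// Display the sorted result
spark.sql("select * from DEPT order by department desc, state desc").show(false)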

Happy Learning !!

