Spark – Sort multiple DataFrame columns

In Spark, the sort() and orderBy() functions of the DataFrame sort the data by one or more columns. Both sort in ascending order by default; use asc and desc on a Column to set the direction explicitly.

When sorting on multiple columns, you can sort some columns in ascending order and others in descending order. Spark orders the rows by the first column and uses each following column only to break ties.

1. Using sort() to sort multiple columns

In Spark, we can use the sort() function of the DataFrame to sort by multiple columns. To control the direction of each column, call asc or desc on the Column. Note that sort() returns a new DataFrame and leaves df unchanged.


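// Using sort() to sort multiple columns
import org.apache.spark.sql.functions.col
df.sort("department", "state")                     // both columns ascending (default)
df.sort(col("department").asc, col("state").desc)  // department ascending, state descending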
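
To try the snippet above end to end, here is a minimal sketch that builds a small DataFrame and applies the mixed-order sort. The sample rows, the object name SortMultipleColumns, and the helper names sampleData and sortDeptAscStateDesc are assumptions made for illustration; only sort(), col(), asc and desc come from the Spark API discussed here. It assumes Spark SQL is on the classpath and runs with a local master.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SortMultipleColumns {

  // Hypothetical sample data, used only for this illustration
  def sampleData(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      ("Sales", "NY", 90000),
      ("Sales", "CA", 81000),
      ("Finance", "NY", 83000),
      ("Finance", "CA", 99000)
    ).toDF("department", "state", "salary")
  }

  // department ascending first, then state descending within each department
  def sortDeptAscStateDesc(df: DataFrame): DataFrame =
    df.sort(col("department").asc, col("state").desc)

  def main(args: Array[String]): Unit = {
    // Local session, suitable for experimentation only
    val spark = SparkSession.builder()
      .appName("SortMultipleColumns")
      .master("local[1]")
      .getOrCreate()

    sortDeptAscStateDesc(sampleData(spark)).show()

    spark.stop()
  }
}
```

With this sample data, the Finance rows come before the Sales rows, and within each department NY comes before CA because state is sorted in descending order.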

2. Using orderBy() to sort multiple columns

Alternatively, we can use the orderBy() function of the DataFrame to sort by multiple columns. orderBy() is an alias for sort(), so it behaves the same way: use asc for ascending and desc for descending.


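// Using orderBy() to sort multiple columns
import org.apache.spark.sql.functions.col
df.orderBy("department", "state")                     // both columns ascending (default)
df.orderBy(col("department").asc, col("state").desc)  // department ascending, state descending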
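
As a further sketch, the same ordering can be written with the asc() and desc() functions from org.apache.spark.sql.functions, which take a column name instead of a Column. The object name OrderByMultipleColumns, the helper names, and the sample rows below are assumptions for illustration; both helpers should produce the same row order.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{asc, col, desc}

object OrderByMultipleColumns {

  // Direction set with the asc/desc methods on Column
  def byColumnMethods(df: DataFrame): DataFrame =
    df.orderBy(col("department").asc, col("state").desc)

  // Same ordering, written with the asc()/desc() functions that take a column name
  def byFunctions(df: DataFrame): DataFrame =
    df.orderBy(asc("department"), desc("state"))

  def main(args: Array[String]): Unit = {
    // Local session, suitable for experimentation only
    val spark = SparkSession.builder()
      .appName("OrderByMultipleColumns")
      .master("local[1]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data, used only for this illustration
    val df = Seq(
      ("Sales", "NY"),
      ("Sales", "CA"),
      ("Finance", "NY"),
      ("Finance", "CA")
    ).toDF("department", "state")

    byColumnMethods(df).show()
    byFunctions(df).show()

    spark.stop()
  }
}
```

Which form to use is mostly a matter of taste: the Column methods read well when you already have Column expressions, while asc("name") and desc("name") are shorter when you only have column names.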

Happy Learning !!

Naveen (NNK)