Spark – Sort multiple DataFrame columns

In Spark, the sort() and orderBy() functions of the DataFrame sort rows by one or more columns. Each column is sorted in ascending order by default; use asc for ascending and desc for descending to set the order explicitly.

When sorting on multiple columns, you can sort some columns in ascending order and others in descending order. Columns are applied in the order listed: rows are sorted by the first column, and ties are broken by the next one.

Using sort() to sort multiple columns

In Spark, we can use the sort() function of the DataFrame to sort by multiple columns. To choose ascending or descending order for a column, call asc or desc on the Column.


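import org.apache.spark.sql.functions.col
df.sort("department", "state")                     // both columns ascending (the default)
df.sort(col("department").asc, col("state").desc)  // department ascending, state descending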
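
To see the mixed ordering end to end, here is a minimal sketch. It assumes a local Spark installation with spark-sql on the classpath; the sample rows, the salary column, the SortExample object, and the sortDeptAscStateDesc helper are hypothetical names made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SortExample {
  // Hypothetical helper: department ascending, then state descending within each department.
  def sortDeptAscStateDesc(df: DataFrame): DataFrame =
    df.sort(col("department").asc, col("state").desc)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SortExample")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data, not from the original article.
    val df = Seq(
      ("Sales", "NY", 90000),
      ("Sales", "CA", 81000),
      ("Finance", "NY", 83000),
      ("Finance", "CA", 99000)
    ).toDF("department", "state", "salary")

    sortDeptAscStateDesc(df).show()

    spark.stop()
  }
}
```

With this sample data, the rows should come out as Finance/NY, Finance/CA, Sales/NY, Sales/CA: departments in ascending order, and states in descending order within each department.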

Using orderBy() to sort multiple columns

Alternatively, we can use the orderBy() function of the DataFrame to sort by multiple columns, again using asc for ascending and desc for descending.


df.orderBy("department", "state")                     // both columns ascending (the default)
df.orderBy(col("department").asc, col("state").desc)  // department ascending, state descending
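
In the Scala API, orderBy() is documented as an alias of sort(), so the two should give the same ordering. The sketch below checks that on a small DataFrame. It also uses the asc and desc functions from org.apache.spark.sql.functions, which take a column name as a string, as an alternative to calling asc or desc on a Column. The sample rows and the OrderByExample object are hypothetical, and a local Spark setup is assumed.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{asc, desc}

object OrderByExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("OrderByExample")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; every (department, state) pair is unique,
    // so the sort order has no ties.
    val df = Seq(
      ("Sales", "NY"),
      ("Finance", "CA"),
      ("Sales", "CA"),
      ("Finance", "NY")
    ).toDF("department", "state")

    // asc("...") and desc("...") build sort expressions from column names.
    val viaOrderBy = df.orderBy(asc("department"), desc("state")).collect().toList
    val viaSort = df.sort(asc("department"), desc("state")).collect().toList

    viaOrderBy.foreach(println)
    println(s"Same ordering from sort(): ${viaOrderBy == viaSort}")

    spark.stop()
  }
}
```

Which of the two to use is mostly a matter of taste; orderBy() may read more naturally to people coming from SQL.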

Happy Learning !!

NNK
