Spark SQL provides built-in standard sort functions defined in the DataFrame API; these come in handy when we need to sort a DataFrame by one or more columns. Each of them accepts a column name as a String and returns a Column.
When possible, prefer the standard library functions: they offer a bit more compile-time safety, handle nulls, and perform better than UDFs. If performance is critical for your application, avoid custom UDFs wherever you can, since Spark treats a UDF as a black box that it cannot optimize, so a UDF carries no performance guarantee.
Spark SQL sort functions are grouped as “sort_funcs” in Spark SQL; they come in handy when we want to sort columns in ascending or descending order.
They are primarily used with the sort() function of a DataFrame or Dataset.
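As a rough illustration of how these functions plug into sort(), here is a minimal sketch. It assumes Spark SQL is on the classpath and is written as a script (for example, pasted into spark-shell); the sample data, the column names, and the local session settings are made up for this example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{asc, desc}

// Assumption: a local, single-threaded session is enough for this illustration.
val spark = SparkSession.builder()
  .appName("sort-functions-overview")
  .master("local[1]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data, not taken from the article.
val df = Seq(
  ("James", "Sales", 3000),
  ("Anna", "Finance", 3900),
  ("Maria", "Sales", 4600),
  ("Robert", "Finance", 3300)
).toDF("name", "dept", "salary")

// Department ascending, then salary descending within each department.
val sorted = df.sort(asc("dept"), desc("salary"))
sorted.show()

// Collect the names in sorted order before shutting the session down.
val names = sorted.collect().map(_.getString(0)).toSeq
println(names.mkString(", "))

spark.stop()
```

orderBy() is an alias of sort() on a Dataset, so `df.orderBy(asc("dept"), desc("salary"))` should behave the same way.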
SPARK SQL SORT FUNCTION SYNTAX | SPARK FUNCTION DESCRIPTION |
---|---|
asc(columnName: String): Column | Specifies ascending sort order for a DataFrame or Dataset column. |
asc_nulls_first(columnName: String): Column | Similar to asc, but null values are returned first, followed by non-null values. |
asc_nulls_last(columnName: String): Column | Similar to asc, but non-null values are returned first, followed by null values. |
desc(columnName: String): Column | Specifies descending sort order for a DataFrame or Dataset column. |
desc_nulls_first(columnName: String): Column | Similar to desc, but null values are returned first, followed by non-null values. |
desc_nulls_last(columnName: String): Column | Similar to desc, but non-null values are returned first, followed by null values. |
asc() – ascending function
The asc function specifies ascending sort order for a DataFrame or Dataset column.
Syntax: asc(columnName: String): Column
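A minimal sketch of asc() in use, assuming Spark SQL is on the classpath and the snippet runs as a script (for example, in spark-shell). The names and salaries are hypothetical sample values.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.asc

// Assumption: local mode, used only to keep the example self-contained.
val spark = SparkSession.builder()
  .appName("asc-example")
  .master("local[1]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data, deliberately out of order.
val df = Seq(
  ("Maria", 4600),
  ("James", 3000),
  ("Anna", 3900)
).toDF("name", "salary")

// Lowest salary first.
val sorted = df.sort(asc("salary"))
sorted.show()

// Names in sorted order: James (3000), Anna (3900), Maria (4600).
val names = sorted.collect().map(_.getString(0)).toSeq

spark.stop()
```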
asc_nulls_first() – ascending with nulls first
Similar to asc, but null values are returned first, followed by non-null values.
Syntax: asc_nulls_first(columnName: String): Column
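The following sketch shows asc_nulls_first() on a column that contains a null. It assumes Spark SQL is on the classpath and uses invented sample data; the missing salary is modelled with `Option[Int]`, which Spark stores as a nullable column. Ascending order in Spark already places nulls first by default, so this function mostly makes that choice explicit.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.asc_nulls_first

// Assumption: local mode, used only to keep the example self-contained.
val spark = SparkSession.builder()
  .appName("asc-nulls-first-example")
  .master("local[1]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data; None becomes a null salary.
val rows: Seq[(String, Option[Int])] = Seq(
  ("James", Some(3000)),
  ("Anna", None),
  ("Maria", Some(4600))
)
val df = rows.toDF("name", "salary")

// Null salary first, then the remaining salaries in ascending order.
val sorted = df.sort(asc_nulls_first("salary"))
sorted.show()

// Names in sorted order: Anna (null), James (3000), Maria (4600).
val names = sorted.collect().map(_.getString(0)).toSeq

spark.stop()
```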
asc_nulls_last() – ascending with nulls last
Similar to asc, but non-null values are returned first, followed by null values.
Syntax: asc_nulls_last(columnName: String): Column
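To see asc_nulls_last() push nulls to the end, consider this small sketch. It assumes Spark SQL is on the classpath; the data and column names are hypothetical, and the null is produced from a `None` value.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.asc_nulls_last

// Assumption: local mode, used only to keep the example self-contained.
val spark = SparkSession.builder()
  .appName("asc-nulls-last-example")
  .master("local[1]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data; None becomes a null salary.
val rows: Seq[(String, Option[Int])] = Seq(
  ("Maria", Some(4600)),
  ("Anna", None),
  ("James", Some(3000))
)
val df = rows.toDF("name", "salary")

// Non-null salaries in ascending order, null salary at the end.
val sorted = df.sort(asc_nulls_last("salary"))
sorted.show()

// Names in sorted order: James (3000), Maria (4600), Anna (null).
val names = sorted.collect().map(_.getString(0)).toSeq

spark.stop()
```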
desc() – descending function
The desc function specifies descending sort order for a DataFrame or Dataset column.
Syntax: desc(columnName: String): Column
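A minimal sketch of desc() in use, assuming Spark SQL is on the classpath and the snippet runs as a script (for example, in spark-shell). The sample rows are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.desc

// Assumption: local mode, used only to keep the example self-contained.
val spark = SparkSession.builder()
  .appName("desc-example")
  .master("local[1]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data, deliberately out of order.
val df = Seq(
  ("James", 3000),
  ("Maria", 4600),
  ("Anna", 3900)
).toDF("name", "salary")

// Highest salary first.
val sorted = df.sort(desc("salary"))
sorted.show()

// Names in sorted order: Maria (4600), Anna (3900), James (3000).
val names = sorted.collect().map(_.getString(0)).toSeq

spark.stop()
```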
desc_nulls_first() – descending with nulls first
Similar to desc, but null values are returned first, followed by non-null values.
Syntax: desc_nulls_first(columnName: String): Column
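The sketch below illustrates desc_nulls_first() on a nullable column. It assumes Spark SQL is on the classpath and uses made-up data. Descending order in Spark places nulls last by default, so this variant is the one to reach for when nulls should lead a descending sort.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.desc_nulls_first

// Assumption: local mode, used only to keep the example self-contained.
val spark = SparkSession.builder()
  .appName("desc-nulls-first-example")
  .master("local[1]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data; None becomes a null salary.
val rows: Seq[(String, Option[Int])] = Seq(
  ("James", Some(3000)),
  ("Anna", None),
  ("Maria", Some(4600))
)
val df = rows.toDF("name", "salary")

// Null salary first, then the remaining salaries in descending order.
val sorted = df.sort(desc_nulls_first("salary"))
sorted.show()

// Names in sorted order: Anna (null), Maria (4600), James (3000).
val names = sorted.collect().map(_.getString(0)).toSeq

spark.stop()
```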
desc_nulls_last() – descending with nulls last
Similar to desc, but non-null values are returned first, followed by null values.
Syntax: desc_nulls_last(columnName: String): Column
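Finally, a short sketch of desc_nulls_last(), again assuming Spark SQL is on the classpath and using hypothetical sample rows with one missing salary.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.desc_nulls_last

// Assumption: local mode, used only to keep the example self-contained.
val spark = SparkSession.builder()
  .appName("desc-nulls-last-example")
  .master("local[1]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data; None becomes a null salary.
val rows: Seq[(String, Option[Int])] = Seq(
  ("Anna", None),
  ("James", Some(3000)),
  ("Maria", Some(4600))
)
val df = rows.toDF("name", "salary")

// Non-null salaries in descending order, null salary at the end.
val sorted = df.sort(desc_nulls_last("salary"))
sorted.show()

// Names in sorted order: Maria (4600), James (3000), Anna (null).
val names = sorted.collect().map(_.getString(0)).toSeq

spark.stop()
```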
Reference: Spark functions Scala code