Spark RDD fold() function example

In this tutorial, you will learn the syntax and usage of the Spark RDD fold() function and how to use it to calculate the min, max, and total of the elements in an RDD. The examples are in Scala, and the same approach can be used with Java and PySpark (Python). Syntax: def fold(zeroValue: T)(op: (T, T)…
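
As a rough illustration of how fold() arrives at a min, max, or total, here is a minimal plain-Python model of the behavior. It is not Spark code: the helper `fold_partitions` and the sample data are hypothetical, and the sketch assumes that fold() starts from the zero value inside every partition and once more when the per-partition results are merged.

```python
from functools import reduce


def fold_partitions(partitions, zero_value, op):
    """Hypothetical model of an RDD fold over pre-split partitions."""
    # Fold each partition on its own, starting from the zero value.
    partials = [reduce(op, part, zero_value) for part in partitions]
    # Merge the per-partition results, starting from the zero value again.
    return reduce(op, partials, zero_value)


# Made-up data, already split into two partitions.
partitions = [[3, 7, 1], [9, 4]]

total = fold_partitions(partitions, 0, lambda a, b: a + b)
smallest = fold_partitions(partitions, float("inf"), min)
largest = fold_partitions(partitions, float("-inf"), max)

# A zero value that is not neutral for the operation is applied once per
# partition and once in the final merge, so it skews the result.
skewed = fold_partitions(partitions, 10, lambda a, b: a + b)

print(total, smallest, largest, skewed)
```

With two partitions, the non-neutral zero value 10 is added three times, which is why the zero value should normally be the identity of the operation (0 for a sum, positive infinity for a min, negative infinity for a max).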


Spark RDD reduce() function example

The Spark RDD reduce() aggregate action is used to calculate the min, max, and total of the elements in a dataset. In this tutorial, I will explain the RDD reduce() syntax and usage in Scala; the same approach can be used with Java and PySpark (Python). Syntax: def reduce(f:…
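
To make the behavior of reduce() concrete, the following is a small plain-Python sketch rather than Spark code. The helper `reduce_partitions` and the sample data are hypothetical. The sketch assumes that reduce() takes no zero value, combines elements within each non-empty partition, then combines the partial results, and fails on an empty dataset.

```python
from functools import reduce


def reduce_partitions(partitions, f):
    """Hypothetical model of an RDD reduce over pre-split partitions."""
    # Reduce every non-empty partition; empty ones contribute nothing.
    partials = [reduce(f, part) for part in partitions if part]
    if not partials:
        # Without a zero value there is nothing to return for empty input.
        raise ValueError("cannot reduce an empty dataset")
    # Combine the per-partition results with the same function.
    return reduce(f, partials)


# Made-up data with one empty partition in the middle.
partitions = [[3, 7, 1], [], [9, 4]]

total = reduce_partitions(partitions, lambda a, b: a + b)
smallest = reduce_partitions(partitions, min)
largest = reduce_partitions(partitions, max)

print(total, smallest, largest)
```

Because the partial results can be combined in any grouping, the function passed in should be commutative and associative, as addition, min, and max are.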


Spark RDD aggregate() operation example

In this tutorial, you will learn how to aggregate elements using the Spark RDD aggregate() action to calculate the min, max, total, and count of RDD elements. The examples are in Scala, and the same approach can be used with Java and PySpark (Python). RDD aggregate() syntax: def aggregate[U](zeroValue: U)(seqOp: (U, T) ⇒ U,…
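
The idea behind aggregate() is that the accumulator can have a different type from the elements, so min, max, total, and count can be gathered in one pass. Below is a minimal plain-Python model of that idea, not Spark code. The helper `aggregate_partitions`, the functions `seq_op` and `comb_op`, and the sample data are hypothetical names chosen for this sketch.

```python
from functools import reduce


def aggregate_partitions(partitions, zero_value, seq_op, comb_op):
    """Hypothetical model of an RDD aggregate over pre-split partitions."""
    # seq_op folds each element into a per-partition accumulator.
    partials = [reduce(seq_op, part, zero_value) for part in partitions]
    # comb_op merges the per-partition accumulators.
    return reduce(comb_op, partials, zero_value)


# Accumulator layout: (min, max, total, count).
zero = (float("inf"), float("-inf"), 0, 0)


def seq_op(acc, x):
    lo, hi, total, count = acc
    return (min(lo, x), max(hi, x), total + x, count + 1)


def comb_op(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2], a[3] + b[3])


# Made-up integer data, already split into two partitions.
partitions = [[3, 7, 1], [9, 4]]

stats = aggregate_partitions(partitions, zero, seq_op, comb_op)
print(stats)
```

Here the elements are integers while the accumulator is a 4-tuple, which is the case that fold() and reduce() cannot express directly because they require the result type to match the element type.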
