Spark By Examples | Learn Spark Tutorial with Examples

In this Apache Spark tutorial, you will learn Spark with Scala examples. Every example explained here is available in the Spark-examples GitHub project for reference. All the Spark examples in this tutorial are basic, simple, and easy to practice for beginners who are enthusiastic to learn Spark, and all were tested in our development environment.

Note: If you can't find the Spark example you are looking for on this tutorial page, I recommend using the Search option in the menu bar to find it.

Apache Spark Core

In this section of the tutorial, you will learn different concepts of the Spark Core library with examples. Spark Core is the base library of Spark; it provides distributed task dispatching, scheduling, basic I/O functionality, and more.

Spark RDD Tutorial with Examples

RDD (Resilient Distributed Dataset) is the fundamental data structure and the primary data abstraction of Apache Spark and Spark Core. RDDs are fault-tolerant, immutable distributed collections of objects: once you create an RDD you cannot change it, and a transformation returns a new RDD instead. Each RDD is divided into logical partitions, which can be computed on different nodes of the cluster.
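To make immutability and partitioning concrete, here is a minimal sketch. It assumes Spark is on the classpath and runs in local mode; the object name `RddSketch`, the helper `doubled`, and the choice of four partitions are illustrative only and do not come from the Spark-examples project.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object RddSketch {
  // Returns a NEW RDD; the input RDD is left untouched because RDDs are immutable.
  def doubled(numbers: RDD[Int]): RDD[Int] = numbers.map(_ * 2)

  def main(args: Array[String]): Unit = {
    // local[2] runs Spark in-process with two worker threads (an assumption for this sketch).
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("RddSketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Ask for 4 logical partitions; on a cluster each could be computed on a different node.
    val numbers = sc.parallelize(1 to 10, numSlices = 4)
    val result = doubled(numbers)

    println(s"partitions: ${numbers.getNumPartitions}")
    println(s"original:   ${numbers.collect().mkString(",")}")
    println(s"doubled:    ${result.collect().mkString(",")}")

    spark.stop()
  }
}
```

Printing `numbers` after calling `doubled` shows the original values unchanged, which is the practical meaning of "you cannot change an RDD".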

This Spark RDD tutorial will help you start understanding and using Apache Spark RDDs with Scala examples. All RDD examples in this tutorial were tested in our development environment and are available in the GitHub spark scala examples project for quick reference.

Spark DataFrame Tutorial with Examples

In this Spark SQL DataFrame tutorial, I explain several commonly used operations and functions on DataFrame and Dataset with working Scala examples. This section is a work in progress, and more articles will be added.
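As a taste of the kind of DataFrame operations meant here, the sketch below filters, groups, and aggregates a small in-memory DataFrame. The column names, the sample rows, and the helper `avgSalaryByDept` are made up for illustration; only the Spark API calls themselves are real.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, col}

object DataFrameSketch {
  // Average salary per department, counting only employees older than `minAge`.
  def avgSalaryByDept(employees: DataFrame, minAge: Int): DataFrame =
    employees
      .filter(col("age") > minAge)
      .groupBy("dept")
      .agg(avg("salary").as("avg_salary"))
      .orderBy("dept")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("DataFrameSketch")
      .getOrCreate()
    import spark.implicits._ // enables toDF on local collections

    // Hypothetical sample data.
    val employees = Seq(
      ("Ann", "eng", 34, 120.0),
      ("Bob", "eng", 28, 100.0),
      ("Cid", "ops", 41, 90.0)
    ).toDF("name", "dept", "age", "salary")

    employees.printSchema()
    avgSalaryByDept(employees, minAge = 25).show()

    spark.stop()
  }
}
```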

Spark Dataset Tutorial with Examples

SQL Functions

Spark SQL provides several built-in functions. When possible, use these standard library functions: they offer a little more compile-time safety, handle nulls, and perform better than UDFs. If your application is performance-critical, avoid custom UDFs wherever you can, because their performance is not guaranteed.
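The difference can be sketched by upper-casing a column both ways. The built-in `upper` returns null for a null input without extra work, while the hand-written UDF has to check for null itself. The object and helper names below are hypothetical, and the sketch does not measure performance.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf, upper}

object BuiltinVsUdfSketch {
  // Built-in function: null-safe, and Spark can reason about the expression.
  def withBuiltin(df: DataFrame): DataFrame =
    df.withColumn("name_upper", upper(col("name")))

  // Equivalent custom UDF: Spark treats the function body as a black box,
  // and the null check is our own responsibility.
  private val upperUdf = udf((s: String) => if (s == null) null else s.toUpperCase)

  def withUdf(df: DataFrame): DataFrame =
    df.withColumn("name_upper", upperUdf(col("name")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("BuiltinVsUdfSketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data, including a null to show null handling.
    val names = Seq("ann", "bob", null).toDF("name")

    withBuiltin(names).show()
    withUdf(names).show()

    spark.stop()
  }
}
```

Both calls produce the same column here; the built-in version is the one to prefer whenever an equivalent function exists.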

In this section, we will see several tutorials on Spark SQL functions with Scala examples.

Data Source Examples

Spark SQL supports operating on a variety of data sources through the DataFrame interface. This section of the tutorial describes reading and writing data using the Spark Data Sources with Scala examples. Using the Data Source API, we can load data from or save data to RDBMS databases, Avro, Parquet, XML, etc.
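A minimal round trip through the reader and writer might look like the sketch below. It uses Parquet and CSV because both ship with Spark; Avro, XML, and JDBC sources typically need an extra package or driver on the classpath, so they are left out. The temp directory, file names, and the helper `roundTripParquet` are assumptions made for this example.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}

object DataSourceSketch {
  // Writes `df` as Parquet under `path`, then loads it back through the same API.
  def roundTripParquet(spark: SparkSession, df: DataFrame, path: String): DataFrame = {
    df.write.mode(SaveMode.Overwrite).parquet(path)
    spark.read.parquet(path)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("DataSourceSketch")
      .getOrCreate()
    import spark.implicits._

    // Scratch location for this sketch only.
    val dir = Files.createTempDirectory("spark-ds-sketch").toString
    val people = Seq((1, "Ann"), (2, "Bob")).toDF("id", "name")

    roundTripParquet(spark, people, s"$dir/people.parquet").show()

    // The same write/read pattern applies to other built-in formats, for example CSV.
    people.write.mode(SaveMode.Overwrite).option("header", "true").csv(s"$dir/people.csv")
    spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(s"$dir/people.csv")
      .show()

    spark.stop()
  }
}
```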

Spark Streaming | Kafka Examples

Spark – Accessing HBase Examples

Learn Spark from these Books

