How to Run Spark Hello World Example in IntelliJ

Here, I will explain how to run an Apache Spark Hello World example in IntelliJ on Windows using Scala and Maven. I have a basic Spark example in the Apache Spark GitHub Examples project; to keep things simple, I will clone it and use it here.

Before you proceed, make sure you have the IntelliJ IDE set up and can run a Spark application with Scala on Windows.

Spark Maven Dependency

To run the Spark Hello World example in IntelliJ, you need the Scala and Spark Maven dependencies below.


    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>

    <dependency>
      <groupId>org.specs</groupId>
      <artifactId>specs</artifactId>
      <version>1.2.5</version>
      <scope>test</scope>
    </dependency>

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>${spark.version}</version>
      <scope>compile</scope>
    </dependency>

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.11</artifactId>
      <version>${spark.version}</version>
      <scope>compile</scope>
    </dependency>
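The `${scala.version}` and `${spark.version}` placeholders above are resolved from the `<properties>` section of the pom.xml. A minimal sketch with example version numbers only (these are assumptions; use the versions your project targets, keeping `scala.version` consistent with the `_2.11` artifact suffix):

```xml
<properties>
  <!-- Example values only: match scala.version to the _2.11 suffix above -->
  <scala.version>2.11.12</scala.version>
  <spark.version>2.4.8</spark.version>
</properties>
```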

Spark Hello World Example

In most languages, a Hello World example simply prints a statement to the console. Since Spark is a framework for processing data in memory, I will instead show how to create a SparkSession object and print some details from it.


import org.apache.spark.sql.SparkSession

object SparkSessionTest {

  def main(args: Array[String]): Unit = {

    // Create a SparkSession running locally with one thread
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SparkByExample")
      .getOrCreate()

    println("First SparkContext:")
    println("APP Name :" + spark.sparkContext.appName)
    println("Deploy Mode :" + spark.sparkContext.deployMode)
    println("Master :" + spark.sparkContext.master)

    // getOrCreate() returns the already-active session, so this does
    // not create a second SparkContext
    val sparkSession2 = SparkSession.builder()
      .master("local[1]")
      .appName("SparkByExample-test")
      .getOrCreate()

    println("Second SparkContext:")
    println("APP Name :" + sparkSession2.sparkContext.appName)
    println("Deploy Mode :" + sparkSession2.sparkContext.deployMode)
    println("Master :" + sparkSession2.sparkContext.master)
  }
}
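A side note on the output: getOrCreate() hands back the already-active session rather than building a new one, so the second block prints the same app name, deploy mode, and master as the first. A minimal sketch to check this (assumes the Spark dependencies from the pom are on the classpath; the object name is mine):

```scala
import org.apache.spark.sql.SparkSession

object SessionReuseCheck {
  def main(args: Array[String]): Unit = {
    val s1 = SparkSession.builder().master("local[1]").appName("first").getOrCreate()
    val s2 = SparkSession.builder().master("local[1]").appName("second").getOrCreate()
    // getOrCreate() returned the existing session, so both share one SparkContext
    assert(s1.sparkContext eq s2.sparkContext)
    // the second appName setting was ignored
    assert(s1.sparkContext.appName == "first")
    s1.stop()
  }
}
```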

Spark GitHub Clone – Hello World Example Project

To keep things simple, I have created a Spark Hello World project on GitHub, which I will use to run the example. First, let's clone the project, build it, and run it.

  • Open IntelliJ IDEA
  • Create a new project by selecting File > New > Project from Version Control.

Using this option, we import the project directly from the GitHub repository.

  • On the Get from Version Control window, select Git as the version control system, enter the GitHub URL below, and choose the directory to clone into.

https://github.com/spark-examples/spark-hello-world-example
  • If you don’t have Git installed, select the “Download and Install” option from the same window.
  • After Git is installed, click Clone to clone the project into the chosen folder.
  • This creates a new project in IntelliJ and starts cloning.
  • Wait a few minutes for the clone to complete and for the project to import into the workspace.
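If you prefer the command line, the clone step can also be done from a terminal (this assumes Git is on your PATH; the target folder defaults to the repository name):

```shell
# Clone the Hello World example project and enter it
git clone https://github.com/spark-examples/spark-hello-world-example
cd spark-hello-world-example
```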

Once the cloning completes, you will see the project workspace structure on IntelliJ.

Run Maven build

Now run the Maven build: open the Maven tool window on the right, navigate to Lifecycle > install, right-click, and select Run Maven Build.
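The same build also works from a terminal in the project root, if you prefer the CLI (assumes Maven is installed and on your PATH):

```shell
# Resolves the pom.xml dependencies, compiles the examples,
# and installs the artifact into the local repository
mvn clean install
```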


This downloads all the dependencies listed in the pom.xml file and compiles all the examples in this tutorial. It also takes a few minutes to complete, and you should see the message below after a successful build.

Spark Hello World intellij

Run Hello World Spark Program

After a successful Maven build, run the SparkSessionTest object under src/main/scala (fully qualified name: com.sparkbyexamples.spark.SparkSessionTest).

If you still get errors while running the Spark application, restart the IntelliJ IDE and run the application again. You should then see the message below in the console.
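As an alternative to the IDE run configuration, the example can also be launched with spark-submit against the packaged jar (a sketch: the jar file name below is an assumption, so check your target/ folder, and spark-submit must be on your PATH):

```shell
spark-submit \
  --class com.sparkbyexamples.spark.SparkSessionTest \
  --master "local[1]" \
  target/spark-hello-world-example.jar
```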

Spark Hello World Example

Where to go next?

Once you are able to run the Spark Hello World example, you should read about Spark RDD, creating a Spark DataFrame, and how to read a CSV file into Spark.

Happy Learning !!

NNK


This Post Has 2 Comments

  1. Micky Williamson

    Multiple versions of scala libraries detected!

    1. NNK

Hi, if you happen to have multiple Scala versions installed, select either the 2.11 or 2.12 version to match the Spark artifacts.

