Among the many available IDEs, IntelliJ IDEA is the most widely used IDE for running Spark applications written in Scala, thanks to its good Scala code completion. In this article, I will explain how to set up and run an Apache Spark application written in Scala using Apache Maven with IntelliJ IDEA.
1. Install JDK
You might be aware that Spark is written in Scala, and Scala is a JVM language that needs the JVM to run; hence, to compile and execute a Spark application you need Java installed on your system.
Download and install Java 8 or above from Oracle.com. You can verify the installation by running java -version from a command prompt; it should report version 1.8 or higher.
2. Setup IntelliJ IDEA for Spark
Most Spark engineers use IntelliJ IDEA to run Spark applications written in Scala due to its good Scala compatibility; hence, it is better to have a development environment set up using IntelliJ.
IntelliJ IDEA comes in Community and Ultimate editions. To run a Spark application written in Scala, the Community edition is more than enough, so download the IntelliJ IDEA Community edition.
1. You can download either the Windows installer (.exe) or a compressed zip (.zip) file, whichever is more convenient. I've downloaded the .zip file.

2. Now, let's unzip it using WinZip, 7-Zip, or any other zip extractor you have. I've used 7-Zip to extract the contents to a folder.



3. Move the extracted folder from Downloads to your working folder. In my case, I am moving it to c:\apps\.
4. Start the IntelliJ IDE by running idea64.exe from C:\apps\ideaIC-2020.2.1.win\bin\idea64.exe.
3. Create a Scala Project in IntelliJ
After starting the IntelliJ IDEA IDE, you will get a Welcome screen with different options.
1. Select New Project to open the New Project window.



2. Select Maven from the left panel
3. Check the Create from archetype option.
4. Select org.scala-tools.archetypes:scala-archetype-simple.
- An archetype is a kind of template that creates the right directory structure and downloads the required default dependencies. Since we have selected a Scala archetype, it downloads all the Scala dependencies and enables IntelliJ to write Scala code.
5. In the next window, enter the project name. I am naming my project spark-hello-world-example.
6. On the next screen, review the options for artifact-id and group-id.
7. Select Finish.



You will see the project created in IntelliJ, with the project structure shown in the left Project panel.
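For reference, the archetype generates roughly the following layout (exact file names can vary between archetype versions):

spark-hello-world-example/
  pom.xml
  src/main/scala/org/example/App.scala      (sample application generated by the archetype)
  src/test/scala/org/example/               (generated test sources)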
4. Install Scala Plugin
Now navigate to the plugin settings:
1. Open File > Settings (or use the shortcut keys Ctrl + Alt + S).
2. Select the Plugins option from the left panel. This brings up the plugins panel; search for the Scala plugin.
3. Click Install to install the Scala plugin.



4. After the plugin is installed, restart the IntelliJ IDE.
5. Setup Scala SDK
1. IntelliJ will prompt you, as shown below, to set up the Scala SDK.



2. Select Setup Scala SDK; it opens the window shown below.
3. Select the Create option.



4. From the next window, select the Download option.
5. Choose Scala version 2.12.12 (the latest at the time of writing this article). Note that Spark 3.0.0 is built against Scala 2.12, so the 2.12.x line is the one to use here.
6. Make changes to pom.xml file
Now we need to make some changes in the pom.xml file. You can either follow the instructions below or download the pom.xml file from the GitHub project and replace your pom.xml file with it.
1. First, change the Scala version to the latest version; I am using 2.12.12.
<properties>
  <scala.version>2.12.12</scala.version>
</properties>
2. Remove the following plugin; the old org.scala-tools maven-scala-plugin is no longer maintained.
<plugin>
  <groupId>org.scala-tools</groupId>
  <artifactId>maven-scala-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>compile</goal>
        <goal>testCompile</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <scalaVersion>${scala.version}</scalaVersion>
    <args>
      <arg>-target:jvm-1.5</arg>
    </args>
  </configuration>
</plugin>
7. Delete Unnecessary Files
Now delete the following from the project workspace.
- Delete src/test
- Delete src/main/scala/org.example.App



8. Add Spark Dependencies to Maven pom.xml File
Add the following Spark dependencies to the <dependencies> section of the pom.xml file. Note that the _2.12 suffix on the artifact names is the Scala binary version and must match the Scala version configured above.
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.12</artifactId>
  <version>3.0.0</version>
  <scope>compile</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.12</artifactId>
  <version>3.0.0</version>
  <scope>compile</scope>
</dependency>
9. Create a Spark Hello World Application in IntelliJ
1. Now create the Spark Hello World program. Create a new Scala object named SparkSessionTest in the org.example package under src/main/scala. Our hello world example doesn't display the text "Hello World"; instead, it creates a SparkSession and prints the Spark app name, master, and deployment mode to the console.
package org.example

import org.apache.spark.sql.SparkSession

object SparkSessionTest extends App {

  // Create a SparkSession that runs locally with a single core
  val spark = SparkSession.builder()
    .master("local[1]")
    .appName("SparkByExample")
    .getOrCreate()

  println("First SparkContext:")
  println("APP Name :" + spark.sparkContext.appName)
  println("Deploy Mode :" + spark.sparkContext.deployMode)
  println("Master :" + spark.sparkContext.master)

  // getOrCreate() returns the already-running session, so this second
  // "session" shares the same SparkContext and prints the same values
  val sparkSession2 = SparkSession.builder()
    .master("local[1]")
    .appName("SparkByExample-test")
    .getOrCreate()

  println("Second SparkContext:")
  println("APP Name :" + sparkSession2.sparkContext.appName)
  println("Deploy Mode :" + sparkSession2.sparkContext.deployMode)
  println("Master :" + sparkSession2.sparkContext.master)
}
2. Sometimes the dependencies in pom.xml are not loaded automatically; in that case, re-import the dependencies or restart IntelliJ.
3. Run the Maven build.



4. Finally, run the Spark application SparkSessionTest.
5. This should display the below output on the console. If you still get errors while running the Spark application, restart the IntelliJ IDE and run the application again.
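For reference, ignoring Spark's own log output, the program should print roughly the following. In local mode the deploy mode is reported as client, and because getOrCreate() reuses the first session, both blocks print the same app name:

First SparkContext:
APP Name :SparkByExample
Deploy Mode :client
Master :local[1]
Second SparkContext:
APP Name :SparkByExample
Deploy Mode :client
Master :local[1]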



If you have any questions or run into errors while setting up Spark on IntelliJ, please comment or ask me a question on Ask Me.
What to read next?
Once you complete the Spark setup, you should learn what Spark Session and Spark Context are, and read about Spark RDD, Spark RDD Actions, and Spark RDD Transformations.
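As a quick taste of what's next, here is a minimal sketch of an RDD transformation and action; it assumes the same spark session created in the example above:

// Parallelize a small collection into an RDD
val rdd = spark.sparkContext.parallelize(Seq(1, 2, 3, 4, 5))
// map is a transformation: it is evaluated lazily
val doubled = rdd.map(_ * 2)
// collect is an action: it triggers the computation and returns the results
println(doubled.collect().mkString(", "))   // prints: 2, 4, 6, 8, 10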
Happy Learning !!