In this article, I will explain step by step how to install Apache Spark on Windows OS (7, 10, and later versions) and also how to start the history server and monitor your jobs using the Web UI.
Install Java 8 or Later
To install Apache Spark on Windows, you need Java 8 or a later version, so download the Java version from Oracle and install it on your system. If you prefer OpenJDK, you can download and install that instead.
After the download, double-click the downloaded .exe file (jdk-8u201-windows-x64.exe) to install it on your Windows system. Choose any custom directory or keep the default location.
Note: This article explains installing Apache Spark with Java 8; the same steps also work for Java 11 and Java 13.
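To verify the Java installation, open a new Command Prompt and check the version; the output should look something like below (the exact build numbers depend on the update you installed).
java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)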
Apache Spark Installation on Windows
Apache Spark comes as a compressed tar/zip file, so installation on Windows is not much of a deal: you just need to download and untar the file. Download Apache Spark by accessing the Spark Download page and select the link under “Download Spark” (point 3 in the below screenshot).
If you want to use a different version of Spark & Hadoop, select the one you want from the drop-down; the link at point 3 changes to the selected version and provides you with an updated download link.

After the download, untar the binary using 7zip or any other zip utility, and copy the extracted directory spark-3.0.0-bin-hadoop2.7 to c:\apps\opt\spark-3.0.0-bin-hadoop2.7.
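You can quickly confirm the extraction by listing the bin directory; it should contain spark-shell.cmd among other launch scripts.
dir c:\apps\opt\spark-3.0.0-bin-hadoop2.7\bin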
Spark Environment Variables
After installing Java and Apache Spark on Windows, set the JAVA_HOME, SPARK_HOME, HADOOP_HOME, and PATH environment variables. If you know how to set environment variables on Windows, add the following.
JAVA_HOME = C:\Program Files\Java\jdk1.8.0_201
PATH = %PATH%;%JAVA_HOME%\bin
SPARK_HOME = C:\apps\opt\spark-3.0.0-bin-hadoop2.7
HADOOP_HOME = C:\apps\opt\spark-3.0.0-bin-hadoop2.7
PATH = %PATH%;%SPARK_HOME%\bin
Follow the below steps if you are not sure how to add or edit environment variables on Windows; alternatively, see the command-line sketch after this list.
1. Open the System Environment Variables window and select Environment Variables.

2. On the following Environment Variables screen, add SPARK_HOME, HADOOP_HOME, and JAVA_HOME by selecting the New option.

3. This opens the New User Variable window where you can enter the variable name and value.
4. Now edit the PATH variable.

5. Add the Spark, Java, and Hadoop bin locations by selecting the New option.

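Alternatively, if you are comfortable with the command line, you can set the same variables with the setx utility. This is just a sketch: setx truncates values longer than 1024 characters and does not affect the current window, so the GUI steps above are the safer route, and you need a new Command Prompt afterwards either way.
setx JAVA_HOME "C:\Program Files\Java\jdk1.8.0_201"
setx SPARK_HOME "C:\apps\opt\spark-3.0.0-bin-hadoop2.7"
setx HADOOP_HOME "C:\apps\opt\spark-3.0.0-bin-hadoop2.7"
setx PATH "%PATH%;C:\Program Files\Java\jdk1.8.0_201\bin;C:\apps\opt\spark-3.0.0-bin-hadoop2.7\bin"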
Spark with winutils.exe on Windows
Many beginners think Apache Spark needs a Hadoop cluster installed to run, but that’s not true; Spark can run without Hadoop and HDFS, for example on AWS using S3, or on Azure using Blob storage, etc.
To run Apache Spark on Windows, you need winutils.exe, because the Hadoop libraries that Spark ships with use POSIX-like file access operations, which winutils.exe implements on Windows using the Windows API. winutils.exe enables Spark to use Windows-specific services, including running shell commands in a Windows environment.
Download winutils.exe for Hadoop 2.7 and copy it to the %SPARK_HOME%\bin folder. winutils binaries differ for each Hadoop version, so download the right one for your Spark and Hadoop distribution from https://github.com/steveloughran/winutils
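If you later hit permission errors on the Hive scratch directory when starting spark-shell (a common issue on Windows), a widely used fix is to create the directory and open up its permissions with winutils; c:\tmp\hive below is Spark’s default scratch location, assuming you have not changed it.
mkdir c:\tmp\hive
%SPARK_HOME%\bin\winutils.exe chmod 777 c:\tmp\hive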
Apache Spark shell
spark-shell is a CLI utility that comes with the Apache Spark distribution. Open a command prompt, cd to %SPARK_HOME%\bin, and type the spark-shell command to run the Apache Spark shell. You should see something like below (ignore the error you see at the end). Sometimes it may take a minute or two for your Spark instance to initialize and get to the below screen.

Spark-shell also creates a Spark context Web UI, which by default can be accessed from http://localhost:4040 (if port 4040 is taken, Spark picks the next free port, such as 4041).
On the spark-shell command line, you can run any Spark statements, like creating an RDD, getting the Spark version, etc.
scala> spark.version
res2: String = 3.0.0
scala> val rdd = sc.parallelize(Array(1,2,3,4,5,6,7,8,9,10))
rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:24
scala>
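As a quick sanity check, you can run a couple of actions on the RDD you just created; the res numbers in your output may differ depending on how many statements you have already run.
scala> rdd.count
res3: Long = 10
scala> rdd.map(_ * 2).collect()
res4: Array[Int] = Array(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)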
This completes the installation of Apache Spark on Windows 7, 10, and later versions.
Where to go Next?
You can continue with the sections below to see how to debug logs using the Spark Web UI and enable the Spark history server, or follow the links as next steps.
Web UI on Windows
Apache Spark provides a suite of Web UIs (Jobs, Stages, Tasks, Storage, Environment, Executors, and SQL) to monitor the status of your Spark application, the resource consumption of your Spark cluster, and Spark configurations. On the Spark Web UI, you can see how operations are executed.

History Server
The history server keeps a log of all Spark applications you submit via spark-submit or spark-shell. You can enable Spark to collect these logs by adding the below configs to the spark-defaults.conf file, located in the %SPARK_HOME%\conf directory (if the file does not exist, copy spark-defaults.conf.template in the same directory to spark-defaults.conf).
spark.eventLog.enabled true
spark.history.fs.logDirectory file:///c:/logs/path
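Note that Spark does not create the event log directory for you; create it before running any application with event logging enabled, otherwise the application will fail to start.
mkdir c:\logs\path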
After setting the above properties, start the history server by running the below command.
%SPARK_HOME%\bin\spark-class.cmd org.apache.spark.deploy.history.HistoryServer
By default, the history server listens on port 18080, and you can access it from a browser using http://localhost:18080/

By clicking on each App ID, you will get the details of the application in the Spark Web UI.
Conclusion
In summary, you have learned how to install Apache Spark on Windows, run sample statements in spark-shell, and start the Spark Web UI and history server.
If you have any issues setting it up, please message me in the comments section and I will try to respond with a solution.
Happy Learning !!