Five easy steps to install the latest version of Apache Spark on Mac (macOS) – Installing Apache Spark on macOS has become very easy with Homebrew. You can install it and start running examples in about five minutes. There are multiple ways to install Apache Spark on Mac; this article uses Homebrew.
Below I explain, step by step, how to install Apache Spark on macOS using Homebrew, validate the install, run spark-shell, and create a Spark DataFrame.
Steps to Install the Latest Version of Apache Spark on Mac OS
- Step 1 – Install Homebrew
- Step 2 – Install Java
- Step 3 – Install Scala
- Step 4 – Install Apache Spark Latest Version
- Step 5 – Start Spark shell and Validate Installation
Related: Apache Spark Installation on Windows
1. Install Homebrew
Homebrew bills itself as "The Missing Package Manager for macOS" and is used to install third-party packages such as Java and Apache Spark on Mac (macOS). To use Homebrew, you first need to install it with the command below.
# Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
The installer uses sudo and prompts for your password. This is the password of your macOS user account, which needs administrator rights; on a personal laptop, it is the same password you enter when you log into your Mac. If you don't have administrator access, contact your system admin. After a successful installation, Homebrew prints a confirmation message along with any next steps.
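Post-installation, you may need to run the below commands to add brew to your PATH. Replace /Users/admin with your own home directory. The /opt/homebrew prefix applies to Apple Silicon Macs; on Intel Macs Homebrew installs under /usr/local instead.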
# Set brew to Path
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> /Users/admin/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"
If the above command has issues, check the Homebrew website for the current installation instructions.
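The PATH setup above depends on where Homebrew was installed. As a rough sketch, the snippet below picks the prefix from the CPU architecture and prints the line that would go into your shell profile. The function names `brew_prefix_for_arch` and `shellenv_line` are hypothetical helpers written for this illustration, not part of Homebrew.

```shell
# Map a CPU architecture (as printed by `uname -m`) to Homebrew's
# default install prefix on macOS.
brew_prefix_for_arch() {
  case "$1" in
    arm64) echo "/opt/homebrew" ;;  # Apple Silicon Macs
    *)     echo "/usr/local" ;;     # Intel Macs
  esac
}

# Build the profile line that puts brew on PATH for a given prefix.
shellenv_line() {
  echo "eval \"\$($1/bin/brew shellenv)\""
}

prefix="$(brew_prefix_for_arch "$(uname -m)")"

# Print the line instead of appending it, so you can review it first.
# To apply it, append the printed line to "$HOME/.zprofile".
shellenv_line "$prefix"
```

Printing the line rather than writing it straight to the profile avoids adding duplicate entries if you run the snippet more than once.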
2. Install Java
Spark runs on the Java Virtual Machine, so you need Java on your Mac. Since Java is a third-party package, you can install it with Homebrew's brew command. Because Oracle's JDK comes with its own licensing terms, I am using OpenJDK 11. Run the below command in the terminal to install it.
# Install OpenJDK 11
brew install openjdk@11
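Homebrew installs versioned JDK formulae as keg-only, so `java` may not be on your PATH right after the install. The sketch below, assuming the OpenJDK 11 formula from this step, adds its bin directory to PATH for the current session and prints the Java major version. `java_major_version` is a hypothetical helper written for this example.

```shell
# Extract the major version from a Java version string.
# Handles the legacy "1.8.0_292" scheme and the modern "11.0.20" scheme.
java_major_version() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 ;;
  esac
}

if command -v brew >/dev/null 2>&1; then
  # Put the keg-only JDK first on PATH for this shell session only
  PATH="$(brew --prefix openjdk@11)/bin:$PATH"
  export PATH
fi

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr; the version is the quoted token on line 1
  raw="$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)"
  echo "Java major version: $(java_major_version "$raw")"
else
  echo "java not found on PATH"
fi
```

To make the PATH change permanent, you would add the same export line to your shell profile.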
3. Install Scala
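Apache Spark is written in Scala. The Spark distribution ships with the Scala libraries it needs, so a separate Scala installation is optional, but it is useful if you want to compile and run your own Scala programs.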
# Install Scala (optional)
brew install scala
4. Install Apache Spark on Mac
You can also install the latest version of Apache Spark on Mac using Homebrew. Some background about Spark: Apache Spark is an open-source analytics engine for large-scale distributed data processing and machine learning applications. Spark was originally developed at the University of California, Berkeley, and later donated to the Apache Software Foundation.
# Install Apache Spark
brew install apache-spark
This installs the latest version of Apache Spark on your Mac OS.
After successful installation of the latest version of Apache Spark, run spark-shell from the command line to launch the Spark shell. You should see the Spark banner followed by a scala> prompt (ignore any warnings for now). spark-shell is a CLI utility that comes with the Apache Spark distribution.
Note that it displays the Spark version and Java version you are using on the terminal.
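If you would rather confirm the installs without opening the interactive shell, a quick check along these lines may help. It is only a sketch: `tool_status` is a hypothetical helper, and the list of tools simply mirrors the steps in this article.

```shell
# Report whether a command is available on PATH.
tool_status() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Check each tool installed in the steps of this article
for tool in brew java scala spark-shell; do
  tool_status "$tool"
done

# Print the Spark version banner without starting the interactive shell
if command -v spark-submit >/dev/null 2>&1; then
  spark-submit --version
fi
```

A "missing" line for scala is not a problem on its own, since that step is optional.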
5. Validate Spark Installation from Shell
Let’s create a Spark DataFrame with some sample data to validate the installation. Enter the following commands in the Spark Shell in the same order.
import spark.implicits._
val data = Seq(("Java", "20000"), ("Python", "100000"), ("Scala", "3000"))
val df = data.toDF()
df.show()
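This prints the three rows as a table. Because no column names were passed to toDF(), the columns get the default names _1 and _2. For more examples on Apache Spark refer to PySpark Tutorial with Examples.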
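The same validation can also be run non-interactively. The sketch below writes the commands to a temporary Scala script and feeds it to spark-shell on standard input, so the shell exits when the input ends. The column names `language` and `users_count` and the helper `write_check_script` are illustrative assumptions, not part of the original example.

```shell
# Write a small Scala validation script to the given path.
write_check_script() {
  cat > "$1" <<'EOF'
import spark.implicits._
val data = Seq(("Java", "20000"), ("Python", "100000"), ("Scala", "3000"))
// Column names are illustrative; the original example uses toDF() with defaults
val df = data.toDF("language", "users_count")
df.printSchema()
df.show()
EOF
}

script="$(mktemp "${TMPDIR:-/tmp}/spark-check.XXXXXX")"
write_check_script "$script"

# Feed the script on stdin; spark-shell exits at end of input
if command -v spark-shell >/dev/null 2>&1; then
  spark-shell < "$script"
else
  echo "spark-shell not found on PATH"
fi

rm -f "$script"
```

Naming the columns makes the printed schema easier to read than the default _1 and _2.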
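Now open the Spark Web UI in your favorite web browser to monitor your jobs. spark-shell prints the exact URL at startup; by default it is http://localhost:4040/jobs/, and Spark moves to the next port (for example http://localhost:4041/jobs/) if 4040 is already in use.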
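If you are not sure which port the Web UI ended up on, you can probe the likely candidates from another terminal while spark-shell is running. This is a rough sketch: `spark_ui_url` is a hypothetical helper, and a response only shows that something is listening on the port, not that it is necessarily Spark.

```shell
# Build the jobs-page URL for a given Spark UI port.
spark_ui_url() {
  echo "http://localhost:$1/jobs/"
}

# 4040 is Spark's default UI port; it tries the next ports when one is busy
for port in 4040 4041 4042; do
  url="$(spark_ui_url "$port")"
  if command -v curl >/dev/null 2>&1 &&
     curl -s -o /dev/null --max-time 2 "$url"; then
    echo "Something is responding at $url"
  else
    echo "Nothing responding at $url"
  fi
done
```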
In this article, you have learned the step-by-step installation of the latest version of Apache Spark using Homebrew. The steps include installing Homebrew, Java, Scala, and Apache Spark, and validating the installation by running spark-shell.
Happy Learning !!
Where to go Next?
Now that you have successfully installed the latest version of Apache Spark, you can learn more about the Spark framework by following the below articles.
- Apache Spark Setup with Scala and IntelliJ
- Spark Hello World Example in IntelliJ IDEA
- Spark Setup on Hadoop Cluster with Yarn
- What is SparkSession and How to create it?
- What is SparkContext and How to create it?
- How to Check Spark Version
- Spark Internal Execution plan
- Spark Drop, Delete, Truncate Differences
- Spark Convert a Row into Case Class