Apache Spark 3.x Install on Mac

Five easy steps to install Apache Spark 3 on Mac. Installing Apache Spark 3.x on macOS has become very easy with Homebrew: you can install it and start running examples in about five minutes. There are multiple ways to install Apache Spark on a Mac; this article uses Homebrew.

Related: PySpark Installation on Mac

Below I explain, step by step, how to install Apache Spark on macOS using Homebrew, run the Spark shell, validate the install, and create a Spark DataFrame.

Steps to install Apache Spark on Mac OS

  • Step 1 – Install Homebrew
  • Step 2 – Install Java
  • Step 3 – Install Scala
  • Step 4 – Install Apache Spark
  • Step 5 – Start Spark shell and Validate Installation

Related: Apache Spark Installation on Windows

1. Install Homebrew on Mac

Homebrew bills itself as "The Missing Package Manager for macOS (or Linux)". It is used to install third-party packages such as Java and Apache Spark on macOS. To use it, first install it with the command below.


# Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

This prompts for your password because the installer uses sudo, so your account needs administrator rights. On a personal laptop, this is the same password you enter when you log into your Mac. If you don't have administrator access, contact your system admin. After a successful installation of Homebrew, you should see something like the output below.

[Screenshot: Homebrew installation output]

After installation, you may need to run the commands below to add brew to your $PATH.


# Add brew to PATH (Apple Silicon default prefix; on Intel Macs replace /opt/homebrew with /usr/local)
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"
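The Homebrew location differs between Apple Silicon and Intel Macs, so a small check can pick the right path before loading brew into the current shell. This is only a sketch: the helper name brew_prefix_for_arch is made up for illustration, and it assumes Homebrew was installed to its default prefix.

```shell
# Hypothetical helper: map a CPU architecture to Homebrew's default prefix.
brew_prefix_for_arch() {
  case "$1" in
    arm64)  echo "/opt/homebrew" ;;  # Apple Silicon default
    x86_64) echo "/usr/local" ;;     # Intel Mac default
    *)      return 1 ;;              # unknown architecture
  esac
}

# Look up the prefix for this machine; leave it empty if the architecture is unknown.
prefix="$(brew_prefix_for_arch "$(uname -m)")" || prefix=""

if [ -n "$prefix" ] && [ -x "$prefix/bin/brew" ]; then
  # Load brew's environment (PATH and related variables) into this shell session.
  eval "$("$prefix/bin/brew" shellenv)"
  echo "Homebrew found at $prefix"
else
  echo "Homebrew not found at the default prefix"
fi
```

If this reports that Homebrew was not found, the install step may not have finished, or brew may live in a custom location.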

If the commands above do not work, use the latest ones from Homebrew; the installer also prints them under "Next steps" at the end of its output.

2. Install Java

Spark runs on the Java Virtual Machine (JVM), so you need Java on your Mac. Since Java is a third-party package, you can install it with Homebrew's brew command. Because Oracle's JDK is distributed under Oracle's own license terms rather than an open-source license, I am using OpenJDK 11. Run the command below in the terminal to install it.


# Install OpenJDK 11
brew install openjdk@11
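To confirm which Java the terminal actually picks up, you can parse the first line that java -version prints. The sketch below is illustrative: java_major_version is a hypothetical helper, and it assumes the banner contains the version in double quotes, as OpenJDK builds typically print it. Homebrew usually installs versioned JDK formulas such as openjdk@11 without linking them, so java may not be on your PATH until you follow the caveats brew prints after the install.

```shell
# Hypothetical helper: extract the major Java version from a `java -version` banner line.
java_major_version() {
  # Pull out the quoted version string, e.g. 11.0.21 or 1.8.0_292
  ver="$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')"
  case "$ver" in
    1.*) echo "$ver" | cut -d. -f2 ;;  # legacy scheme: 1.8.x means Java 8
    *)   echo "$ver" | cut -d. -f1 ;;  # modern scheme: 11.0.21 means Java 11
  esac
}

if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr, so redirect it before taking the first line.
  banner="$(java -version 2>&1 | head -n 1)"
  echo "Java major version: $(java_major_version "$banner")"
else
  echo "java is not on the PATH"
fi
```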

3. Install Scala

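Apache Spark is written in Scala. The Spark distribution bundles the Scala libraries it needs, so a separate Scala install is optional for running spark-shell, but it is useful if you want to compile and run standalone Scala programs.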


# Install Scala (optional)
brew install scala

4. Install Apache Spark 3 on Mac

Now install Apache Spark 3 using Homebrew. Some background: Apache Spark is an open-source analytical processing engine for large-scale distributed data processing and machine learning applications. It was originally developed at the University of California, Berkeley, and later donated to the Apache Software Foundation.


# Install Apache Spark
brew install apache-spark

This installs Apache Spark on your Mac.

[Screenshot: Apache Spark installation output]
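Before launching the shell, it can help to confirm that every tool from the steps above is reachable from your terminal. The following is a minimal sketch; check_tool is a hypothetical helper, not part of Homebrew or Spark, and scala is expected to be missing if you skipped the optional step.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: NOT found"
    return 1
  fi
}

# Count how many of the tools installed in this article are missing.
missing=0
for tool in brew java scala spark-shell; do
  check_tool "$tool" || missing=$((missing + 1))
done
echo "Missing tools: $missing"
```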

After Apache Spark installs successfully, run spark-shell from the command line to launch the Spark shell. spark-shell is a CLI utility that ships with the Apache Spark distribution. You should see something like the output below (ignore the warning for now).

[Screenshot: Apache Spark shell]

Note that the shell prints the Spark version and the Java version you are using.

5. Validate Spark 3 Installation from Shell

Let’s create a Spark DataFrame with some sample data to validate the installation. Enter the following commands in the Spark shell, in the order shown.


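// spark-shell provides a ready-made SparkSession named `spark`
import spark.implicits._   // enables toDF() on Scala collections
val data = Seq(("Java", "20000"), ("Python", "100000"), ("Scala", "3000"))
val df = data.toDF()       // columns default to _1 and _2
df.show()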

This yields the output below.

[Screenshot: df.show() output]
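If you prefer to run the same check without typing into the interactive shell, one option is to pipe the commands into spark-shell from the terminal. This is a sketch under assumptions: validation_script is a hypothetical helper, and the column names language and users are made up for illustration (the interactive example above leaves the columns unnamed).

```shell
# Hypothetical helper: print the Scala commands used to validate the install.
validation_script() {
  cat <<'EOF'
import spark.implicits._
val data = Seq(("Java", "20000"), ("Python", "100000"), ("Scala", "3000"))
data.toDF("language", "users").show()
EOF
}

if command -v spark-shell >/dev/null 2>&1; then
  # spark-shell reads the commands from stdin and exits when the input ends.
  validation_script | spark-shell
else
  echo "spark-shell is not on the PATH; skipping validation"
fi
```

A printed table with three rows indicates that Spark started, created a DataFrame, and ran a job.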

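Now open http://localhost:4040/jobs/ in your favorite web browser to access the Spark Web UI and monitor your jobs. Port 4040 is the default; if it is already in use, Spark tries 4041, 4042, and so on, and the spark-shell startup log prints the actual URL.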
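While spark-shell is still running, you can also probe the first few candidate ports from another terminal to see where the Web UI ended up. This is a rough sketch: spark_ui_url is a hypothetical helper, the range 4040 to 4042 is an arbitrary choice, and any other service listening on one of those ports would show up as a false positive.

```shell
# Hypothetical helper: build the Spark Web UI jobs URL for a given port.
spark_ui_url() {
  echo "http://localhost:$1/jobs/"
}

found=""
if command -v curl >/dev/null 2>&1; then
  for port in 4040 4041 4042; do
    # -f fails on HTTP errors, -s hides progress, -L follows redirects,
    # --max-time bounds the wait per port
    if curl -fsL --max-time 2 -o /dev/null "$(spark_ui_url "$port")"; then
      found="$(spark_ui_url "$port")"
      break
    fi
  done
fi

if [ -n "$found" ]; then
  echo "Spark Web UI is reachable at $found"
else
  echo "No Spark Web UI found on ports 4040-4042 (is spark-shell running?)"
fi
```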

Conclusion

In this Spark installation on Mac article, you have learned the step-by-step installation of Apache Spark using Homebrew. The steps include installing Homebrew, Java, Scala, and Apache Spark, and validating the installation by running spark-shell.

Happy Learning !!

Naveen (NNK)

I am Naveen (NNK), working as a Principal Engineer. I am a seasoned Apache Spark Engineer with a passion for harnessing the power of big data and distributed computing to drive innovation and deliver data-driven insights. I love to design, optimize, and manage Apache Spark-based solutions that transform raw data into actionable intelligence. I am also passionate about sharing my knowledge of Apache Spark, Hive, PySpark, R, etc.