PySpark: Exception: Java gateway process exited before sending the driver its port number

Problem: While running a PySpark application through spark-submit, from Spyder, or even from the PySpark shell, I get Exception: Java gateway process exited before sending the driver its port number.

Solution: PySpark: Exception: Java gateway process exited before sending the driver its port number

To run PySpark (Spark with Python) you need Java installed on your Mac, Linux, or Windows machine. If Java is not installed, if the JAVA_HOME environment variable is not set to the Java installation path, or if PYSPARK_SUBMIT_ARGS is not set, you can get Exception: Java gateway process exited before sending the driver its port number.

Set PYSPARK_SUBMIT_ARGS

Set PYSPARK_SUBMIT_ARGS with a master; in many cases this resolves Exception: Java gateway process exited before sending the driver its port number. Note that the value should end with pyspark-shell.


export PYSPARK_SUBMIT_ARGS="--master local[3] pyspark-shell"

Open ~/.bashrc (for example with vi ~/.bashrc), add the line above, and reload the file with source ~/.bashrc
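If you launch PySpark from an IDE such as Spyder, the shell's ~/.bashrc may not be read, so the same variable can be set from Python instead. The following is a minimal sketch, not the only way to do it; `configure_submit_args` is a hypothetical helper name, and the default master simply mirrors the export line above.

```python
import os


def configure_submit_args(master: str = "local[3]") -> str:
    """Set PYSPARK_SUBMIT_ARGS for this process unless it is already set.

    Returns the value that is in effect afterwards.
    """
    args = f"--master {master} pyspark-shell"
    # setdefault keeps any value already exported in the environment.
    return os.environ.setdefault("PYSPARK_SUBMIT_ARGS", args)


# Call this before the first SparkSession or SparkContext is created,
# because PySpark reads the variable when it launches the Java gateway.
configure_submit_args()

# Afterwards, start Spark as usual (requires PySpark and Java):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.getOrCreate()
```

Using `setdefault` means a value exported in the shell still wins, so the script and ~/.bashrc do not fight each other.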

If the issue is still not resolved, check your Java installation and the JAVA_HOME environment variable.
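A quick way to run that check from Python is sketched below. It only inspects environment variables and the file system; `diagnose_java` is a hypothetical helper, and the messages it returns are illustrative, not output from Spark itself.

```python
import os
import shutil


def diagnose_java(env=None):
    """Return a list of likely problems with the Java setup (empty if none)."""
    env = os.environ if env is None else env
    problems = []

    java_home = env.get("JAVA_HOME")
    if not java_home:
        problems.append("JAVA_HOME is not set")
    else:
        # A valid JAVA_HOME contains bin/java (bin/java.exe on Windows).
        java_bin = os.path.join(java_home, "bin", "java")
        if not (os.path.isfile(java_bin) or os.path.isfile(java_bin + ".exe")):
            problems.append(f"no java executable under {java_home}")

    # Look for a java launcher on the PATH taken from the same environment.
    if shutil.which("java", path=env.get("PATH")) is None:
        problems.append("java is not on PATH")

    return problems


for problem in diagnose_java():
    print("Possible cause:", problem)
```

An empty result does not guarantee that Spark will start, but each reported problem is worth fixing before looking elsewhere.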

Install OpenJDK

Why do you need Java to run PySpark?

Spark is written mainly in Scala. As its industry adoption grew, the PySpark API was released for Python, built on Py4J. Py4J is a Java library integrated within PySpark that lets Python dynamically interface with JVM objects. Hence, to run PySpark you need Java installed along with Python and Apache Spark.
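To make the Py4J bridge concrete, here is a minimal sketch that asks the JVM for its own version from Python. It assumes PySpark and Java are installed. `jvm_java_version` is a hypothetical helper name, and `_jvm` is an internal SparkContext attribute (the Py4J view of the JVM), so treat this as an illustration, not a supported API.

```python
def jvm_java_version(master: str = "local[1]") -> str:
    """Return the JVM's java.version property, read through Py4J."""
    # Imported lazily so this module loads even where PySpark is missing.
    from pyspark.sql import SparkSession

    # Creating the session launches the Java gateway process; this is the
    # step that fails with the exception when Java cannot be started.
    spark = SparkSession.builder.master(master).getOrCreate()
    try:
        # _jvm is internal: attribute access is forwarded to the JVM by Py4J.
        jvm = spark.sparkContext._jvm
        return jvm.java.lang.System.getProperty("java.version")
    finally:
        spark.stop()


# Example (requires a working Java and PySpark setup):
# print(jvm_java_version())
```

The Python process never runs Spark's engine itself; it sends calls like this one to the JVM, which is why a missing or broken Java installation stops PySpark before any job starts.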

Use the commands below to install OpenJDK or Oracle JDK on Ubuntu Linux.


# Install OpenJDK 11
sudo add-apt-repository ppa:openjdk-r/ppa
sudo apt-get update
sudo apt-get install openjdk-11-jdk

# Install Oracle JDK version 8
# Note: the webupd8team PPA has been discontinued, so this may fail on current systems
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
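After installing, it can help to confirm which Java version is actually picked up. The sketch below runs `java -version` (which prints to stderr) and extracts the major version; `parse_java_major` and `installed_java_major` are hypothetical helper names, and the parsing covers the common `version "11.0.x"` and legacy `version "1.8.0_x"` formats only.

```python
import re
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_292).
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major() -> Optional[int]:
    """Return the major version of the java on PATH, or None if not found."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None  # no java executable on PATH
    # `java -version` writes to stderr; stdout is included as a fallback.
    return parse_java_major(result.stderr + result.stdout)
```

Calling `installed_java_major()` returns `None` when no `java` command is found, which points back to the installation step above.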

Set JAVA_HOME Environment Variable

Now export JAVA_HOME, pointing it at the Java installation directory.


export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64

Open ~/.bashrc (for example with vi ~/.bashrc), add the line above, and reload the file with source ~/.bashrc
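The installation directory differs between machines, so the path in the export line may not match yours. One way to find it, sketched here under the assumption that a `java` launcher is on PATH and follows the usual `<JAVA_HOME>/bin/java` layout, is to resolve the launcher's symlinks and go up two directories. `java_home_from_binary` and `guess_java_home` are hypothetical helper names.

```python
import os
import shutil
from typing import Optional


def java_home_from_binary(java_path: str) -> str:
    """Derive a JAVA_HOME candidate from the path of a java executable."""
    # On Ubuntu /usr/bin/java is usually a chain of symlinks; realpath
    # follows them to the real <JAVA_HOME>/bin/java file.
    real_path = os.path.realpath(java_path)
    return os.path.dirname(os.path.dirname(real_path))


def guess_java_home() -> Optional[str]:
    """Return a JAVA_HOME candidate for the java on PATH, or None."""
    java_path = shutil.which("java")
    if java_path is None:
        return None
    return java_home_from_binary(java_path)


print("Suggested JAVA_HOME:", guess_java_home())
```

The printed value is only a suggestion; confirm that it contains bin/java before adding it to ~/.bashrc.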

Happy Learning

NNK
