In this article, I will quickly cover different ways to find the installed PySpark (Spark with Python) version, both from the command line and at runtime. You can use these options to check the PySpark version on Hadoop (CDH), AWS Glue, Anaconda, Jupyter Notebook, etc., whether you are on Mac, Linux, Windows, or CentOS.
1. Find PySpark Version from Command Line
Like most other tools and languages, you can use the --version option with the spark-submit, spark-shell, pyspark, and spark-sql commands to find the PySpark version.
pyspark --version
spark-submit --version
spark-shell --version
spark-sql --version
All of the above commands (spark-submit, spark-shell, pyspark, and spark-sql) print output similar to the one below, where you can check the installed PySpark version.
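For example, the output looks roughly like the following (the Spark version shown here is illustrative; yours will reflect whatever release you have installed, and the remaining build details are trimmed):

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.1.2
      /_/

Using Scala version 2.12.10, OpenJDK 64-Bit Server VM, 11.0.13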
As you can see, it displays the Spark version along with the Scala version (2.12.10) and the Java version. For Java, I am using OpenJDK, hence it shows the JVM as OpenJDK 64-Bit Server VM, 11.0.13.
2. Check Version From Shell
Additionally, if you are already in the pyspark shell and want to check the PySpark version without exiting it, you can use sc.version. sc is a SparkContext variable that exists by default in the pyspark shell. Use the below steps to find the Spark version.
- cd to $SPARK_HOME/bin
- Launch the pyspark shell command
- Enter sc.version or spark.version
Both sc.version and spark.version return the version as a string.
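Here is a quick sketch of what this looks like inside the shell (the version string shown is just an example):

>>> sc.version
'3.1.2'
>>> spark.version
'3.1.2'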
3. Find PySpark Version from Runtime
If you are writing a PySpark application and want to find the PySpark version at runtime, you can get it by accessing the version or sparkContext.version property of the SparkSession object.
# Import PySpark
import pyspark
from pyspark.sql import SparkSession

# Create SparkSession
spark = SparkSession.builder.master("local[1]") \
    .appName('SparkByExamples.com') \
    .getOrCreate()

# Both properties return the version as a string
print('PySpark Version: ' + spark.version)
print('PySpark Version: ' + spark.sparkContext.version)
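Both statements print the same version string. Note that you can also read the version of the installed pyspark package directly from its __version__ attribute, without creating a SparkSession:

# Version of the installed pyspark package; no SparkSession needed
print(pyspark.__version__)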
In this short article, you have learned how to check the PySpark version from the command line, from the pyspark shell, and at runtime. You can use these approaches on Hadoop (CDH), AWS Glue, Anaconda, Jupyter Notebook, etc.
Happy Learning !!
Related Articles
- PySpark cache() Explained
- PySpark Write to CSV File
- PySpark SparkContext Explained
- PySpark Shell Command Usage with Examples
- Install PySpark in Jupyter on Mac using Homebrew
- Install PySpark in Anaconda & Jupyter Notebook
- How to Install PySpark on Mac (in 2022)