Spark | PySpark Versions Supportability Matrix

Each Spark and PySpark release supports a specific set of Python, Java, and Scala versions, and that set changes from release to release as older language versions are dropped and newer ones are added. Knowing which Python, Java, and Scala versions your Spark/PySpark release supports helps you use its capabilities effectively.

Running Spark with an unsupported Python, Java, or Scala version can cause errors when you start Spark, run Spark applications, or work within the Spark environment, so it is always best practice to install versions that are compatible with your Spark release.

Spark Compatibility Matrix of Java and Scala

If you are using Spark with Scala, you must install the right Java and Scala versions. Here’s a table summarizing Spark versions along with their compatible Java and Scala versions:

| Spark Version | Compatible Java Versions | Compatible Scala Versions |
|---|---|---|
| Spark 1.x | Java 7 or later | Scala 2.10.x |
| Spark 2.0 – 2.3 | Java 7 or later (Java 8 or later from Spark 2.2) | Scala 2.11.x |
| Spark 2.4.x | Java 8 | Scala 2.11.x, 2.12.x |
| Spark 3.0.x | Java 8, 11 | Scala 2.12.x |
| Spark 3.1.x | Java 8, 11 | Scala 2.12.x |
| Spark 3.2.x | Java 8, 11 | Scala 2.12.x, 2.13.x |
| Spark 3.3.x | Java 8, 11, 17 | Scala 2.12.x, 2.13.x |
| Spark 3.4.x | Java 8, 11, 17 | Scala 2.12.x, 2.13.x |
| Spark 3.5.x | Java 8, 11, 17 | Scala 2.12.x, 2.13.x |

Spark’s Compatibility with Java and Scala Versions

While these are common compatibilities for each Spark version, it’s always advisable to refer to the official Spark documentation or release notes for the most accurate and updated information regarding compatible Java and Scala versions for a specific Spark release. Compatibility might vary slightly or be enhanced in minor releases within a major Spark version.
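As a rough illustration of how such a check could be automated, here is a minimal Python sketch that parses a Java version string and compares its major version against a lookup table. The names `SUPPORTED_JAVA`, `java_major`, and `is_java_supported` are hypothetical helpers written for this example, not part of Spark. Only the Spark 3.5 row is filled in; the other rows would need to be added from the official documentation for the release you use.

```python
import re

# Hypothetical lookup table: Spark release line -> supported Java major versions.
# Only the Spark 3.5 row is filled in here (Java 8, 11, 17); extend it from the
# official Spark documentation for other releases.
SUPPORTED_JAVA = {
    "3.5": {8, 11, 17},
}


def java_major(version):
    """Return the major Java version from strings like '1.8.0_392' or '17.0.9'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized Java version string: {version!r}")
    first = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_392).
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def is_java_supported(spark_version, java_version):
    """Check a Java version against the table entry for a Spark release line."""
    # Reduce '3.5.1' to its release line '3.5'.
    release_line = ".".join(spark_version.split(".")[:2])
    if release_line not in SUPPORTED_JAVA:
        raise KeyError(f"No entry for Spark {release_line}; add it to SUPPORTED_JAVA")
    return java_major(java_version) in SUPPORTED_JAVA[release_line]
```

For example, `is_java_supported("3.5.1", "11.0.21")` looks up the `3.5` row and checks whether major version 11 is listed. The Java version string itself would typically come from the output of `java -version` on the machine that runs Spark.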

PySpark Compatibility Matrix of Python Versions

If you use Spark with Python (PySpark), you must install the right Java and Python versions. Here’s a table summarizing PySpark versions along with their compatible and supported Python versions:

| PySpark Version | Compatible Python Versions |
|---|---|
| PySpark 1.x | Python 2.6, 2.7 |
| PySpark 2.0 – 2.3 | Python 2.7, 3.4, 3.5, 3.6 |
| PySpark 2.4.x | Python 2.7, 3.4, 3.5, 3.6, 3.7 |
| PySpark 3.0.x | Python 2.7, 3.4, 3.5, 3.6, 3.7, 3.8 |
| PySpark 3.1.x | Python 3.6, 3.7, 3.8, 3.9 |
| PySpark 3.2.x | Python 3.6, 3.7, 3.8, 3.9 |
| PySpark 3.3.x | Python 3.7, 3.8, 3.9, 3.10 |
| PySpark 3.4.x | Python 3.7, 3.8, 3.9, 3.10, 3.11 |
| PySpark 3.5.x | Python 3.8, 3.9, 3.10, 3.11 |

Please note that these are general compatibility guidelines for PySpark versions and Python versions. Always refer to the official documentation or release notes for the specific PySpark version you use for the most accurate and updated information regarding compatible Python versions. Compatibility might vary slightly or be enhanced in minor releases within a major PySpark version.
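A quick way to guard against a mismatch is to compare the running Python interpreter against the minimum version a PySpark release expects. The sketch below is one possible approach, not an official PySpark utility: `MIN_PYTHON` and `meets_min_python` are hypothetical names, and the table holds only a few rows (Python 3.6 as the minimum for PySpark 3.1 and 3.2, Python 3.8 for PySpark 3.5) as an illustrative subset.

```python
import sys

# Hypothetical lookup table: PySpark release line -> minimum Python (major, minor).
# Illustrative subset only; confirm values against the official documentation.
MIN_PYTHON = {
    "3.1": (3, 6),
    "3.2": (3, 6),
    "3.5": (3, 8),
}


def meets_min_python(pyspark_version, python_version=None):
    """Return True if python_version is at least the minimum for this PySpark line.

    python_version is a (major, minor) tuple; it defaults to the running interpreter.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    # Reduce '3.5.1' to its release line '3.5'.
    release_line = ".".join(pyspark_version.split(".")[:2])
    if release_line not in MIN_PYTHON:
        raise KeyError(f"No entry for PySpark {release_line}; add it to MIN_PYTHON")
    # Tuples compare element by element, so (3, 10) >= (3, 8) is True.
    return tuple(python_version) >= MIN_PYTHON[release_line]


# Check the interpreter running this script against PySpark 3.5.
print(meets_min_python("3.5.0"))
```

This only checks the lower bound. A Python release that is newer than the ones a PySpark version was tested with may still fail, so the upper end of the range is worth confirming in the release notes as well.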

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive, and Machine Learning. Naveen’s journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them. Follow Naveen on LinkedIn and Medium.
