Problem: When I try to use 'sc' in a PySpark program I get the error Spark Context 'sc' Not Defined, but sc works fine in the Spark/PySpark shell.
Solution: Spark Context 'sc' Not Defined?
'sc' is a SparkContext object that is created up front by default in the spark-shell/pyspark shell; this object is also available in Databricks notebooks. However, when you write a standalone PySpark program you need to create a SparkSession yourself, which internally creates the SparkContext.
If you are getting Spark Context 'sc' Not Defined in the Spark/PySpark shell, use the export below:
export PYSPARK_SUBMIT_ARGS="--master local pyspark-shell"
Open ~/.bashrc with vi, add the above line, reload it with source ~/.bashrc, and then launch the spark-shell/pyspark shell.
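If you are hitting the error from a Python script rather than from the shell environment, the same variable can also be set from Python itself. This is a minimal sketch, assuming the variable is set before the first SparkContext is created (PySpark reads it when launching the JVM gateway):

# Set PYSPARK_SUBMIT_ARGS before any SparkContext/SparkSession is created
import os
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local pyspark-shell"

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext
print(sc.master)   # local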
Below is a way to get the SparkContext object in a PySpark program.
# Import PySpark
import pyspark
from pyspark.sql import SparkSession

# Create SparkSession
spark = SparkSession.builder \
      .master("local") \
      .appName("SparkByExamples.com") \
      .getOrCreate()
sc = spark.sparkContext
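As a quick check (a small sketch, not part of the original example), you can use sc right away, for instance to build a small RDD:

# Sanity check: the SparkContext is usable
rdd = sc.parallelize([1, 2, 3, 4, 5])
print(rdd.count())   # 5
print(sc.appName)    # SparkByExamples.com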
In case you get the 'No module named pyspark' error, follow the steps mentioned in How to import PySpark in Python Script to resolve it. In simple words, just use findspark:
# Install findspark
pip install findspark

# Import findspark
import findspark
findspark.init()

# Import PySpark
import pyspark
from pyspark.sql import SparkSession

# Create SparkSession, which creates the SparkContext.
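For completeness, a minimal sketch of that last step; the builder options simply mirror the earlier example:

import findspark
findspark.init()

from pyspark.sql import SparkSession

# Creating the SparkSession also creates the SparkContext
spark = SparkSession.builder \
      .master("local") \
      .appName("SparkByExamples.com") \
      .getOrCreate()
sc = spark.sparkContext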
Alternatively, you can also get a SparkContext object by using SparkContext.getOrCreate(), as shown in the sketch below.
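This is a minimal sketch of that approach; the master and application name are only placeholders:

# Get (or create) a SparkContext directly, without building a SparkSession first
from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("local").setAppName("SparkByExamples.com")
sc = SparkContext.getOrCreate(conf)
print(sc.appName)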
Happy Learning !!