How to Run a PySpark Script from Python?

Question: How do you run (spark-submit) a PySpark application from another Python script as a subprocess and get the status of the job? Solution: Run the PySpark application as a Python process. Generally, a PySpark (Spark with Python) application should be run using the spark-submit script from a shell, or through Airflow/Oozie/Luigi or…
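
As a rough illustration of the idea in this excerpt, here is a minimal sketch that launches spark-submit as a child process and reads its exit code as the job status. It assumes `spark-submit` is on the PATH; the helper names `build_submit_command` and `run_pyspark_app` and their parameters are hypothetical, chosen only for this example, and the full post may take a different approach.

```python
import subprocess
from typing import List, Sequence


def build_submit_command(
    app_path: str,
    spark_submit: str = "spark-submit",  # assumption: the launcher is on the PATH
    submit_args: Sequence[str] = (),     # options for spark-submit, e.g. ["--master", "local[2]"]
    app_args: Sequence[str] = (),        # arguments passed through to the PySpark script
) -> List[str]:
    """Build the argument list: launcher, its options, the script, the script's arguments."""
    return [spark_submit, *submit_args, app_path, *app_args]


def run_pyspark_app(
    app_path: str,
    spark_submit: str = "spark-submit",
    submit_args: Sequence[str] = (),
    app_args: Sequence[str] = (),
) -> subprocess.CompletedProcess:
    """Run the application as a child process and wait for it to finish.

    check=False means a non-zero exit code does not raise an exception,
    so the caller can inspect result.returncode to decide what to do.
    """
    cmd = build_submit_command(app_path, spark_submit, submit_args, app_args)
    return subprocess.run(cmd, capture_output=True, text=True, check=False)
```

A call such as `result = run_pyspark_app("my_job.py", submit_args=["--master", "local[2]"])` (where `my_job.py` is a placeholder script name) blocks until the job ends. A `result.returncode` of 0 conventionally indicates success, and `result.stdout` and `result.stderr` hold the captured output. If the launcher cannot be found, `subprocess.run` raises `FileNotFoundError`.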


Difference between spark-submit vs pyspark commands?

When you are learning Spark, you may wonder why we need both the spark-submit and pyspark commands. I will take a moment to explain the differences between the two. pyspark is a REPL, similar to spark-shell, for the Python language. spark-submit is used to submit a Spark application on…
