
A running Spark application can be killed by issuing the “yarn application -kill <application id>” CLI command. You can also stop a running Spark application in other ways; it all depends on how and where you are running your application.

In this Spark article, I will explain different ways to stop or kill the application or job.

How to find Spark Application ID

Regardless of where you run it, every Spark and PySpark application has an Application ID, and you need this Application ID to stop a specific application.

When you submit your application to a Yarn cluster, you can find the Application ID on the Yarn UI or from the Spark History Server. It looks something like application_16292842912342_34127.

From Yarn UI : http://yarn-host:8088/cluster/apps/RUNNING

From Spark History server: http://history-server-url:18080, you can find the App ID similar to the one highlighted below.

(Screenshot: Spark History Server UI with the application ID highlighted)
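
If you prefer the command line, the Spark History Server also exposes a REST API that lists applications along with their IDs. Below is a minimal sketch, assuming the server is reachable at history-server-url:18080 and that jq is installed to parse the JSON response.

# List the application IDs known to the Spark History Server
# (history-server-url and port 18080 are placeholders for your own setup)
curl -s http://history-server-url:18080/api/v1/applications | jq -r '.[].id'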

You can also get the Spark Application ID by running the following Yarn commands. In the second command, replace "applicationName" with the name of your application.


yarn application -list 
yarn application -appStates RUNNING -list | grep "applicationName"
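
If you want to use the Application ID in a script, you can capture it into a shell variable. This is a minimal sketch that assumes the application was submitted with the name my-spark-app (a hypothetical name, replace it with your own).

# Capture the ID of a RUNNING application named "my-spark-app" (hypothetical name)
APP_ID=$(yarn application -list -appStates RUNNING | grep "my-spark-app" | awk '{print $1}')
echo "Found application: $APP_ID"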

Kill Spark application running on Yarn cluster manager

Once you have the application ID, you can kill the application using any of the methods below.

Using yarn CLI


yarn application -kill application_16292842912342_34127

Using the YARN ResourceManager REST API.

Refer to the Hadoop ResourceManager REST API documentation for more details on how to use it.


PUT http://{resource-manager-host:port}/ws/v1/cluster/apps/{application-id}/state
{
  "state":"KILLED"
}
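
For example, the request above can be sent with curl. This is a sketch assuming the ResourceManager is reachable at resource-manager-host:8088 and that the cluster does not require Kerberos authentication.

# Kill the application through the ResourceManager REST API
# (no Kerberos/SPNEGO authentication assumed; add --negotiate if your cluster needs it)
curl -X PUT -H "Content-Type: application/json" \
     -d '{"state":"KILLED"}' \
     http://resource-manager-host:8088/ws/v1/cluster/apps/application_16292842912342_34127/state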

If you want to kill all applications that are in the ACCEPTED state, put the following in a shell script and run it.


for x in $(yarn application -list -appStates ACCEPTED | awk 'NR > 2 { print $1 }'); do yarn application -kill $x; done

And to kill RUNNING applications, replace ACCEPTED with RUNNING in the above script. A parameterized version is sketched below.
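
If you switch between states often, you can pass the state as an argument instead. This is a minimal sketch; kill-apps.sh is a hypothetical script name.

#!/usr/bin/env bash
# Usage: ./kill-apps.sh ACCEPTED   (or RUNNING)
# Kills every YARN application currently in the given state.
STATE=${1:-ACCEPTED}
for app_id in $(yarn application -list -appStates "$STATE" | awk 'NR > 2 { print $1 }'); do
  echo "Killing $app_id"
  yarn application -kill "$app_id"
done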

Killing from Spark Web UI

If you don’t have access to the Yarn CLI or Spark commands, you can kill the Spark application from the Web UI by accessing the application master page of the Spark job.

  • Open the Spark application UI.
  • Select the Jobs tab.
  • Find the job you want to kill.
  • Select kill to stop the job.
(Screenshot: kill link for a job on the Spark Web UI Jobs tab)

Stop Spark application running on Standalone cluster manager

You can also kill the application by calling the Spark Client class. Here, you need to pass the <master url> and <driver id>.


./bin/spark-class org.apache.spark.deploy.Client kill <master url> <driver id>

You can find the driver ID by accessing the Standalone Master web UI at http://spark-standalone-master-url:8080.
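
For example, with a master at spark://spark-master:7077 and a driver ID copied from the Master web UI (both values here are placeholders):

# Kill a driver submitted to a Standalone cluster in cluster deploy mode
# (spark://spark-master:7077 and driver-20240101123456-0001 are placeholder values)
./bin/spark-class org.apache.spark.deploy.Client kill spark://spark-master:7077 driver-20240101123456-0001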

Kill application running on client mode

In client mode, your application (the Spark driver) runs on the server where you issue the spark-submit command. To stop your application in this mode, simply press Ctrl-C. This exits the application and returns you to the command prompt.
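
If you no longer have the terminal that launched spark-submit, you can stop the driver by terminating its process instead. A minimal sketch, assuming the application was submitted with the (hypothetical) name my-spark-app.

# Find the spark-submit driver process for the hypothetical app name and stop it
DRIVER_PID=$(pgrep -f "spark-submit.*my-spark-app" | head -n 1)
if [ -n "$DRIVER_PID" ]; then
  kill "$DRIVER_PID"   # SIGTERM lets the driver shut down cleanly
fi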

Conclusion

In summary, you can kill a Spark or PySpark application by issuing a Yarn CLI command, calling the ResourceManager REST API, using the Spark Client class, or from the Spark Web UI.

Happy Learning !!
