Sparkling Water – java.lang.NoClassDefFoundError: org/apache/spark/repl/Main


While running H2O Sparkling Water (machine learning models) on a Spark cluster, you may get the exception java.lang.NoClassDefFoundError: org/apache/spark/repl/Main, which causes the program to fail.

I hit this issue when I was running Sparkling Water with the following configuration:

  • Sparkling Water – sparkling-water-3.28.0.3-1-2.4
  • Spark – spark-2.4.4-bin-hadoop2.7 (with winutils)
  • Scala – 2.11.11
  • OS – Windows 10
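For context, the failing program was nothing more than a standard Sparkling Water bootstrap. A minimal sketch is below (the object name and session settings are illustrative; the stack trace shows the failure occurs inside H2OContext.getOrCreate):

```scala
package com.sparkbyexamples.spark

import org.apache.spark.sql.SparkSession
import org.apache.spark.h2o.H2OContext

// Minimal Sparkling Water bootstrap; names and settings are illustrative.
object SparklingWaterExample extends App {

  val spark = SparkSession.builder()
    .master("local[*]")
    .appName("SparkByExamples.com")
    .getOrCreate()

  // The NoClassDefFoundError is thrown from inside getOrCreate:
  // H2O's Scala-code REST endpoint initializes an interpreter pool
  // that depends on classes from Spark's spark-repl module.
  val h2oContext = H2OContext.getOrCreate(spark)
}
```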

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/repl/Main$
	at org.apache.spark.repl.h2o.H2OIMain$._classOutputDirectory$lzycompute(H2OIMain.scala:51)
	at org.apache.spark.repl.h2o.H2OIMain$._classOutputDirectory(H2OIMain.scala:50)
	at org.apache.spark.repl.h2o.H2OIMain$.classOutputDirectory(H2OIMain.scala:60)
	at org.apache.spark.repl.h2o.H2OInterpreter$.classOutputDirectory(H2OInterpreter.scala:84)
	at org.apache.spark.repl.h2o.H2OInterpreter.createSettings(H2OInterpreter.scala:66)
	at org.apache.spark.repl.h2o.BaseH2OInterpreter.initializeInterpreter(BaseH2OInterpreter.scala:100)
	at org.apache.spark.repl.h2o.BaseH2OInterpreter.&lt;init&gt;(BaseH2OInterpreter.scala:290)
	at org.apache.spark.repl.h2o.H2OInterpreter.&lt;init&gt;(H2OInterpreter.scala:38)
	at water.api.scalaInt.ScalaCodeHandler.createInterpreterInPool(ScalaCodeHandler.scala:145)
	at water.api.scalaInt.ScalaCodeHandler$$anonfun$initializeInterpreterPool$1.apply(ScalaCodeHandler.scala:139)
	at water.api.scalaInt.ScalaCodeHandler$$anonfun$initializeInterpreterPool$1.apply(ScalaCodeHandler.scala:138)
	at scala.collection.immutable.Range.foreach(Range.scala:160)
	at water.api.scalaInt.ScalaCodeHandler.initializeInterpreterPool(ScalaCodeHandler.scala:138)
	at water.api.scalaInt.ScalaCodeHandler.&lt;init&gt;(ScalaCodeHandler.scala:42)
	at water.api.scalaInt.ScalaCodeHandler$.registerEndpoints(ScalaCodeHandler.scala:171)
	at water.api.CoreRestAPI$.registerEndpoints(CoreRestAPI.scala:32)
	at water.api.RestAPIManager.register(RestAPIManager.scala:39)
	at water.api.RestAPIManager.registerAll(RestAPIManager.scala:31)
	at org.apache.spark.h2o.backends.internal.InternalH2OBackend.init(InternalH2OBackend.scala:43)
	at org.apache.spark.h2o.H2OContext$H2OContextClientBased.initBackend(H2OContext.scala:450)
	at org.apache.spark.h2o.H2OContext.init(H2OContext.scala:150)
	at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:608)
	at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:636)
	at com.sparkbyexamples.spark.SparklingWaterExample$.delayedEndpoint$com$sparkbyexamples$spark$SparklingWaterExample$1(SparklingWaterExample.scala:13)
	at com.sparkbyexamples.spark.SparklingWaterExample$delayedInit$body.apply(SparklingWaterExample.scala:6)
	at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
	at scala.App$$anonfun$main$1.apply(App.scala:76)
	at scala.App$$anonfun$main$1.apply(App.scala:76)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
	at scala.App$class.main(App.scala:76)
	at com.sparkbyexamples.spark.SparklingWaterExample$.main(SparklingWaterExample.scala:6)
	at com.sparkbyexamples.spark.SparklingWaterExample.main(SparklingWaterExample.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.repl.Main$
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 34 more
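A quick way to confirm that the REPL module really is missing from your application classpath is to try loading the exact class named in the error. This is only a diagnostic sketch, not part of the fix:

```scala
// Diagnostic sketch: check whether spark-repl is on the classpath.
// Prints the offending class name if it cannot be loaded.
try {
  Class.forName("org.apache.spark.repl.Main$")
  println("spark-repl is on the classpath")
} catch {
  case e: ClassNotFoundException =>
    println(s"Missing from classpath: ${e.getMessage}")
}
```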

Solution

In my case, adding the Spark REPL Maven dependency resolved the issue, and I have not seen this exception since. Note that the artifact's Scala binary version suffix (_2.11) and the Spark version (2.4.4) must match the versions used in your project.

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-repl_2.11</artifactId>
    <version>2.4.4</version>
</dependency>
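If your project uses sbt instead of Maven, the equivalent dependency (assuming the same Spark 2.4.4 / Scala 2.11 combination as above) would be:

```scala
// build.sbt — the %% operator appends the Scala binary version (_2.11)
libraryDependencies += "org.apache.spark" %% "spark-repl" % "2.4.4"
```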

If this does not resolve your issue, please leave a comment with the Spark, Sparkling Water, and Scala versions you are using, and I will be happy to help.

Happy Learning !!

Naveen (NNK)

I am Naveen (NNK), working as a Principal Engineer. I am a seasoned Apache Spark engineer with a passion for harnessing the power of big data and distributed computing to drive innovation and deliver data-driven insights. I love to design, optimize, and manage Apache Spark-based solutions that transform raw data into actionable intelligence. I am also passionate about sharing my knowledge of Apache Spark, Hive, PySpark, R, and more.
