In this tutorial, you will learn how to install H2O Sparkling Water on Windows and run the H2O sparkling-shell and the H2O Flow web interface. To run Sparkling Water, you need Apache Spark installed on your computer.
Sparkling Water enables users to run H2O machine learning algorithms on a Spark cluster, which allows H2O to benefit from Spark capabilities such as fast, scalable, distributed in-memory processing.
First, download Apache Spark, unzip the binary to a directory on your computer, and set the SPARK_HOME environment variable to the Spark home directory. I downloaded spark-2.4.4-bin-hadoop2.7. Depending on when you are reading this, download the latest available version; the steps should not have changed much.
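Before moving on, it can help to confirm that SPARK_HOME is actually visible to new processes. On Windows the variable can be set persistently with setx SPARK_HOME followed by the Spark directory; it takes effect in command prompts opened afterwards. The Scala sketch below is a minimal check, not part of Spark or Sparkling Water: the helper name sparkHome is hypothetical, and it only verifies that the variable is set and that a bin folder exists under it.

```scala
import java.nio.file.{Files, Paths}

// Hypothetical helper (not a Spark or Sparkling Water API): returns the
// Spark home directory if SPARK_HOME is set and non-blank, else an error message.
def sparkHome(env: Map[String, String]): Either[String, String] =
  env.get("SPARK_HOME").map(_.trim).filter(_.nonEmpty)
    .toRight("SPARK_HOME is not set")

// Check the real environment and report whether a bin folder exists there.
sparkHome(sys.env) match {
  case Right(home) =>
    val hasBin = Files.isDirectory(Paths.get(home, "bin"))
    println(s"SPARK_HOME = $home (bin directory found: $hasBin)")
  case Left(msg) =>
    println(msg)
}
```

Passing the environment in as a Map keeps the helper easy to try with sample values, for example sparkHome(Map("SPARK_HOME" -> "C:\\apps\\opt\\spark-2.4.4-bin-hadoop2.7")).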
Now, download H2O Sparkling Water and unzip the downloaded file. In my case, I downloaded Sparkling Water version 3.28, which supports Spark 2.4.4, and unzipped it into
After a successful installation, open a command prompt on Windows and change to your Sparkling Water bin directory. In my case, this is C:\apps\opt\sparkling-water\bin.
To start the Sparkling shell, enter sparkling-shell on the command line and press Enter, which produces output like the following. This also initializes the Spark context, with its Web UI available at http://192.168.56.1:4040 (replace the IP address with your system's IP).
cd C:\apps\opt\sparkling-water\bin
C:\apps\opt\sparkling-water\bin>sparkling-shell

-----
  Spark master (MASTER) : local[*]
  Spark home (SPARK_HOME) : C:\apps\opt\spark-2.4.4-bin-hadoop2.7
  H2O build version : 22.214.171.124 (yu)
  Spark build version : 2.4.4
  Scala version : 2.11
----

20/02/13 07:34:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://DELL-ESUHAO2KAJ:4040
Spark context available as 'sc' (master = local[*], app id = local-1581608102876).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.4
      /_/

Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_191)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
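Once the scala> prompt appears, a quick sanity check can confirm that the Spark context responds. The lines below are a minimal sketch meant to be pasted into the running sparkling-shell; they rely on the sc value the shell creates, and the variable name total is only illustrative.

```scala
// Paste into the running sparkling-shell; `sc` is the SparkContext created by the shell.
println(s"Spark version: ${sc.version}")  // should match your installed Spark
println(s"Master:        ${sc.master}")   // local[*] in this setup
println(s"Web UI:        ${sc.uiWebUrl.getOrElse("not available")}")

// Run a tiny job to confirm that tasks execute: sum the numbers 1 to 100.
val total = sc.parallelize(1 to 100).sum()
println(s"Sum of 1 to 100 = $total")
```

If the job completes and prints a sum, the shell is ready for the H2O step.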
Now let’s create an H2OContext, passing the SparkSession object “spark” as a parameter. This creates an H2O cloud inside the Spark cluster.
scala> import org.apache.spark.h2o._
import org.apache.spark.h2o._

scala> val h2oContext = H2OContext.getOrCreate(spark)
h2oContext: org.apache.spark.h2o.H2OContext =

Sparkling Water Context:
 * Sparkling Water Version: 126.96.36.199-1-2.4
 * H2O name: sparkling-water-prabha_local-1581608102876
 * cluster size: 1
 * list of used nodes:
  (executorId, host, port)
  ------------------------
  (driver,192.168.56.1,54321)
  ------------------------

  Open H2O Flow in browser: http://192.168.56.1:54321 (CMD + click in Mac OSX)

scala>
This also starts the H2O Flow web UI for interacting with H2O. Open H2O Flow in a browser at http://192.168.56.1:54321 (replace the IP address with your system's IP).
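To confirm that the H2O cloud is usable from the shell, a small Spark DataFrame can be converted into an H2O frame. This is a sketch under stated assumptions: it relies on the h2oContext value created in the shell session and on the asH2OFrame and openFlow methods as found in the Sparkling Water 3.28 Scala API, which may differ in other versions. The names df and h2oFrame and the sample data are illustrative.

```scala
// Paste into the same sparkling-shell session, after h2oContext has been created.
// Build a tiny Spark DataFrame with one column and five rows (illustrative data).
val df = spark.range(5).toDF("id")

// Convert it to an H2O frame (assumes asH2OFrame from the 3.28 Scala API).
val h2oFrame = h2oContext.asH2OFrame(df)
println(s"H2O frame: ${h2oFrame.numRows()} rows, ${h2oFrame.numCols()} columns")

// Open H2O Flow in the default browser instead of typing the URL by hand.
h2oContext.openFlow()
```

The converted frame should then also be listed among the frames in H2O Flow.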
In this article, you have learned how to install H2O Sparkling Water on Windows, run sparkling-shell, and create an H2OContext, which gives you access to the H2O Flow web UI.
Happy Learning!!