Once your Apache Hadoop Installation is complete and you are able to run HDFS commands, the next step is to configure YARN on the cluster. This post explains how to set up the YARN master on the Hadoop cluster and run a MapReduce example.
Before you proceed with this document, please make sure you have completed the Apache Hadoop Installation and that the Hadoop cluster is up and running.
YARN ships with the Hadoop distribution by default, so no additional installation is needed; you only need to configure Hadoop to use YARN and tune a few memory/core settings.
1. Configure yarn-site.xml
In the yarn-site.xml file (under $HADOOP_HOME/etc/hadoop), configure the default node manager memory and the YARN scheduler minimum and maximum memory settings.
```xml
<!-- Total memory the node manager can allocate to containers on this node -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>1536</value>
</property>

<!-- Largest container the scheduler will grant -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>1536</value>
</property>

<!-- Smallest container; requests are rounded up to a multiple of this -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>128</value>
</property>

<!-- Disable virtual memory checks so containers are not killed for exceeding vmem limits -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
```
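To see how these values interact: YARN normalizes every container request up to a multiple of yarn.scheduler.minimum-allocation-mb, capped at yarn.scheduler.maximum-allocation-mb. For example, a request for 300 MB would be rounded up to 384 MB (3 × 128 MB), so a node offering 1536 MB can run at most four such containers (4 × 384 MB = 1536 MB).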
2. Configure mapred-site.xml file
Add the below properties to the mapred-site.xml file:
```xml
<!-- Memory for the MapReduce ApplicationMaster container -->
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>512</value>
</property>

<!-- Memory for each map task container -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>256</value>
</property>

<!-- Memory for each reduce task container -->
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>256</value>
</property>

<!-- Make the MapReduce framework location visible to the AM, map, and reduce tasks -->
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=$HADOOP_MAPRED_HOME</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=$HADOOP_MAPRED_HOME</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=$HADOOP_MAPRED_HOME</value>
</property>
```
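The last three properties reference the HADOOP_MAPRED_HOME environment variable, so it must be set on every node. A minimal sketch, assuming the standard tarball layout where MapReduce is bundled under $HADOOP_HOME, is to export it in ~/.bashrc (or in $HADOOP_HOME/etc/hadoop/hadoop-env.sh):

```bash
# Assumption: standard Hadoop layout where MapReduce ships inside $HADOOP_HOME.
# Add this to ~/.bashrc or etc/hadoop/hadoop-env.sh on every node.
export HADOOP_MAPRED_HOME=$HADOOP_HOME
```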
3. Configure Data Nodes
Copy the yarn-site.xml and mapred-site.xml files to all data nodes (this cluster has 3 data nodes).
Below is an example of copying the files to datanode1 using the scp command; repeat this step for all of your data nodes.
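A minimal sketch, assuming the default config directory $HADOOP_HOME/etc/hadoop, a hadoop user on the target host, and the same Hadoop layout on every node:

```bash
# Copy the updated config files to datanode1; repeat for each data node.
# $HADOOP_HOME in the remote path expands locally, so the remote layout
# must match (adjust the user and paths for your cluster).
scp $HADOOP_HOME/etc/hadoop/yarn-site.xml hadoop@datanode1:$HADOOP_HOME/etc/hadoop/
scp $HADOOP_HOME/etc/hadoop/mapred-site.xml hadoop@datanode1:$HADOOP_HOME/etc/hadoop/
```

After the files are in place on all data nodes, restart YARN from the master node (stop-yarn.sh followed by start-yarn.sh from $HADOOP_HOME/sbin) and verify that every node manager has registered using yarn node -list.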
Related Articles
- Hadoop FS | HDFS DFS Commands with Examples
- Spark Step-by-Step Setup on Hadoop Yarn Cluster
- Apache Hadoop Installation on Ubuntu (multi-node cluster)
- Start H2O Cluster on Hadoop (External Backend)
- WARNING: "HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX."