zhaoke0727 posted on 2017-12-17 16:51:23

Hadoop 2.x Fully Distributed Installation

  Initial planning
  192.168.100.231                  db01
  192.168.100.232                  db02
  192.168.100.233                  db03
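
The three host names need to resolve on every node. The post does not say how name resolution is set up, so this is only a minimal sketch that assumes plain /etc/hosts entries; the NODES variable and the gen_hosts helper are made up for illustration:

```shell
#!/bin/sh
# Planned nodes from the table above, written as IP:hostname pairs.
NODES="192.168.100.231:db01 192.168.100.232:db02 192.168.100.233:db03"

# Print one /etc/hosts-style line (IP, tab, hostname) per node.
gen_hosts() {
    for entry in $NODES; do
        ip=${entry%%:*}      # text before the colon
        host=${entry##*:}    # text after the colon
        printf '%s\t%s\n' "$ip" "$host"
    done
}

# Review the output first, then append it to /etc/hosts on each node as root.
gen_hosts
```

Printing to stdout instead of editing /etc/hosts directly keeps the step easy to review before it touches a system file.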
1. Install Java
# vim /etc/profile
  Append the environment variables at the end of the file:
  export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
  export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH

  Reload the profile (source /etc/profile), then check that Java is installed correctly:
# java -version
2. Create a hadoop user for installing the software
  groupadd hadoop
  useradd -g hadoop hadoop
  echo "dbking588" | passwd --stdin hadoop
  Configure environment variables:
  export HADOOP_HOME=/opt/cdh-5.3.6/hadoop-2.5.0
  export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH:$HOME/bin
3. Install Hadoop
  # cd /opt/software
  # tar -zxvf hadoop-2.5.0.tar.gz -C /opt/cdh-5.3.6/
  # chown -R hadoop:hadoop /opt/cdh-5.3.6/hadoop-2.5.0
4. Configure SSH
  --Setup:
  $ ssh-keygen -t rsa
  $ ssh-copy-id db02
  (repeat for each of db01, db02 and db03)
  (In my testing, ssh-copy-id only worked with RSA keys; it had no effect with DSA keys)
  --Verify:
$ ssh db02 date
  Wed Apr 19 09:57:34 CST 2017
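
Since the start scripts later on log in to every node over SSH, the key has to reach all three hosts. A rough sketch of looping over them, assuming the hadoop user exists on each node; the DRY_RUN switch and the run/setup_ssh helpers are my own additions, not part of the original steps:

```shell
#!/bin/sh
# Hosts from the plan at the top of the post.
HOSTS="db01 db02 db03"
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy the public key to every node, then confirm that login works without a password.
setup_ssh() {
    for h in $HOSTS; do
        run ssh-copy-id "hadoop@$h"
        run ssh "hadoop@$h" date
    done
}

setup_ssh
```

The local host (db01) is included on purpose, because the cluster scripts also SSH to the node they are started from.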
5. Edit the Hadoop configuration files
  The files that need to be configured are:
  HDFS configuration files:
  etc/hadoop/hadoop-env.sh
  etc/hadoop/core-site.xml
  etc/hadoop/hdfs-site.xml
  etc/hadoop/slaves
  YARN configuration files:
  etc/hadoop/yarn-env.sh
  etc/hadoop/yarn-site.xml
  etc/hadoop/slaves
  MapReduce configuration files:
  etc/hadoop/mapred-env.sh
  etc/hadoop/mapred-site.xml
  The contents of the configuration files are as follows:
$ cat etc/hadoop/core-site.xml
  <?xml version="1.0" encoding="UTF-8"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
  http://www.apache.org/licenses/LICENSE-2.0
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
  -->
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
  <property>
  <name>fs.defaultFS</name>
  <value>hdfs://db01:9000</value>
  </property>
  <property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/cdh-5.3.6/hadoop-2.5.0/data/tmp</value>
  </property>
  <property>
  <name>fs.trash.interval</name>
  <value>7000</value>
  </property>
  </configuration>
$ cat etc/hadoop/hdfs-site.xml
  <?xml version="1.0" encoding="UTF-8"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
  <property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>db03:50090</value>
  </property>
  </configuration>
$ cat etc/hadoop/yarn-site.xml
  <?xml version="1.0"?>
  <configuration>
  <property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
  </property>
  <property>
  <name>yarn.resourcemanager.hostname</name>
  <value>db02</value>
  </property>
  <property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
  </property>
  <property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>600000</value>
  </property>
  </configuration>
$ cat etc/hadoop/mapred-site.xml
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
  <property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  </property>
  <property>
  <name>mapreduce.jobhistory.address</name>
  <value>db01:10020</value>
  </property>
  <property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>db01:19888</value>
  </property>
  </configuration>
$ cat etc/hadoop/slaves
  db01
  db02
  db03
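
Before copying these files around, it can be handy to read a value back and confirm there is no typo. Below is a small sketch that pulls one property out of a *-site.xml file with awk. It assumes each name and value tag sits on its own line, as in the listings above; get_prop is a made-up helper, not a Hadoop tool:

```shell
#!/bin/sh
# get_prop <property-name> <site-xml-file>
# Prints the <value> that follows the matching <name>, or nothing if absent.
get_prop() {
    awk -v want="$1" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$2"
}

# Demo on a scratch copy of the fs.defaultFS property from core-site.xml.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://db01:9000</value>
  </property>
</configuration>
EOF
get_prop fs.defaultFS "$sample"   # prints hdfs://db01:9000
rm -f "$sample"
```

On an installed node, hdfs getconf -confKey fs.defaultFS should report what Hadoop itself resolves, which is the more authoritative check once the binaries are in place.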
  Set the JAVA_HOME variable in the following files:
  etc/hadoop/hadoop-env.sh
  etc/hadoop/yarn-env.sh
  etc/hadoop/mapred-env.sh
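
Editing three files by hand is easy to get wrong, so here is a sketch that sets JAVA_HOME in each of them with sed. It assumes it is run from the Hadoop install directory and that the JDK path is the one from step 1; set_java_home is a hypothetical helper, and sed -i.bak leaves a .bak backup next to each file:

```shell
#!/bin/sh
# JDK path taken from step 1 of the post.
JAVA_HOME_VALUE=/usr/java/jdk1.7.0_67-cloudera

# Replace an existing (possibly commented-out) "export JAVA_HOME=" line,
# or append one if the file has none.
set_java_home() {
    file=$1
    if grep -q '^[# ]*export JAVA_HOME=' "$file"; then
        sed -i.bak "s|^[# ]*export JAVA_HOME=.*|export JAVA_HOME=$JAVA_HOME_VALUE|" "$file"
    else
        printf 'export JAVA_HOME=%s\n' "$JAVA_HOME_VALUE" >> "$file"
    fi
}

# Run from the Hadoop install directory; files that are not present are skipped.
for f in etc/hadoop/hadoop-env.sh etc/hadoop/yarn-env.sh etc/hadoop/mapred-env.sh; do
    if [ -f "$f" ]; then
        set_java_home "$f"
    fi
done
```

Hard-coding the path in these files matters because daemons started over non-interactive SSH may not read /etc/profile.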
  Create the data directory:
  /opt/cdh-5.3.6/hadoop-2.5.0/data/tmp
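
The post does not show how db02 and db03 receive the same configuration. One possible way is sketched below, assuming steps 1 to 3 were already repeated on those nodes and that the files were edited on db01; the DRY_RUN switch and the run/prepare_nodes helpers are illustrative only:

```shell
#!/bin/sh
HADOOP_DIR=/opt/cdh-5.3.6/hadoop-2.5.0
# Nodes other than the one where the files were edited (db01 is assumed here).
OTHER_NODES="db02 db03"
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the data directory everywhere and push the edited config directory out.
prepare_nodes() {
    run mkdir -p "$HADOOP_DIR/data/tmp"
    for h in $OTHER_NODES; do
        run ssh "hadoop@$h" mkdir -p "$HADOOP_DIR/data/tmp"
        run scp -r "$HADOOP_DIR/etc/hadoop" "hadoop@$h:$HADOOP_DIR/etc/"
    done
}

prepare_nodes
```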
6. Format HDFS
$ hdfs namenode -format
7. Start Hadoop
  *Method 1: start the daemons on each server one by one (the most commonly used approach; can be wrapped in a shell script)
  hdfs:
  sbin/hadoop-daemon.sh start|stop namenode
  sbin/hadoop-daemon.sh start|stop datanode
  sbin/hadoop-daemon.sh start|stop secondarynamenode
  yarn:
  sbin/yarn-daemon.sh start|stop resourcemanager
  sbin/yarn-daemon.sh start|stop nodemanager
  mapreduce:
  sbin/mr-jobhistory-daemon.sh start|stop historyserver
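
As a rough idea of what such a script could look like, here is a sketch that maps each daemon to the host implied by the configuration files above (NameNode and history server on db01, ResourceManager on db02, SecondaryNameNode on db03, DataNode and NodeManager on all three). The role-to-host mapping is my reading of those files, and the ACTION and DRY_RUN variables and the run/remote/cluster helpers are invented for the example:

```shell
#!/bin/sh
HADOOP_HOME=${HADOOP_HOME:-/opt/cdh-5.3.6/hadoop-2.5.0}
ACTION=${ACTION:-start}      # start or stop
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# remote <host> <script> <daemon>: run one daemon script on one host over SSH.
remote() {
    run ssh "hadoop@$1" "$HADOOP_HOME/sbin/$2" "$ACTION" "$3"
}

cluster() {
    remote db01 hadoop-daemon.sh namenode
    for h in db01 db02 db03; do
        remote "$h" hadoop-daemon.sh datanode
    done
    remote db03 hadoop-daemon.sh secondarynamenode
    remote db02 yarn-daemon.sh resourcemanager
    for h in db01 db02 db03; do
        remote "$h" yarn-daemon.sh nodemanager
    done
    remote db01 mr-jobhistory-daemon.sh historyserver
}

cluster
```

Running it with ACTION=stop reuses the same order; reversing the order for shutdown would be a reasonable refinement.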
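  *Method 2: start each module separately. Requires passwordless SSH between the nodes; run on the NameNode host
  hdfs:
  sbin/start-dfs.sh
  sbin/stop-dfs.sh
  yarn:
  sbin/start-yarn.sh
  sbin/stop-yarn.sh
  (start-yarn.sh launches the ResourceManager on the host where it is run, so in this layout run it on db02)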
  *Method 3: start everything at once. Not recommended: the command has to be run on the NameNode host, but it also brings up the SecondaryNameNode on the NameNode host
  sbin/start-all.sh
  sbin/stop-all.sh
8. Test the cluster
$ cd /opt/cdh-5.3.6/hadoop-2.5.0/share/hadoop/mapreduce/
$ hadoop jar hadoop-mapreduce-examples-2.5.0.jar pi 10 10
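
If the pi job does not run, a quick thing to check is whether every node has the daemons it is supposed to have. The sketch below compares jps output against the layout implied by the configuration files in step 5; the expected lists are my reading of that configuration, and expected_for/check_jps are hypothetical helpers:

```shell
#!/bin/sh
# Daemons each host should run, derived from the configuration in step 5.
expected_for() {
    case $1 in
        db01) echo "NameNode DataNode NodeManager JobHistoryServer" ;;
        db02) echo "ResourceManager DataNode NodeManager" ;;
        db03) echo "SecondaryNameNode DataNode NodeManager" ;;
    esac
}

# check_jps <host>: read jps output on stdin and report any missing daemon.
check_jps() {
    host=$1
    out=$(cat)
    missing=""
    for d in $(expected_for "$host"); do
        # -w keeps "NameNode" from matching inside "SecondaryNameNode".
        printf '%s\n' "$out" | grep -qw "$d" || missing="$missing $d"
    done
    if [ -n "$missing" ]; then
        echo "$host: missing$missing"
        return 1
    fi
    echo "$host: all expected daemons running"
}

# Demo with canned jps output; on a real cluster: ssh hadoop@db02 jps | check_jps db02
printf '2101 ResourceManager\n2102 DataNode\n2103 NodeManager\n2104 Jps\n' | check_jps db02
```

hdfs dfsadmin -report gives a similar view from the HDFS side by listing the DataNodes that have registered with the NameNode.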