Hadoop cluster environment setup
Steps for setting up a Hadoop big-data cluster (pseudo-distributed) ---- required software: VMware Workstation 11.0
jdk1.7.0_67
hadoop-2.7.3
FileZilla (FTP tool)
Setup steps:
[*] First install a Linux server (that step is omitted here); if you need it, search online for how to install a Linux system.
[*] 1. Turn off the firewall
[*] service iptables stop
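[*] If the system uses the old SysV init (as the iptables service command above suggests), you can also keep the firewall from starting again on boot:
    chkconfig iptables off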
[*] 2. Set the IP address
[*] vi /etc/sysconfig/network-scripts/ifcfg-eth0
[*] or change it through the graphical network settings.
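[*] A minimal static-address sketch for ifcfg-eth0 (the gateway and DNS values are placeholders; use your own network's values and restart the network service afterwards):
    DEVICE=eth0
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=192.168.57.2
    NETMASK=255.255.255.0
    GATEWAY=192.168.57.1
    DNS1=192.168.57.1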
[*] 3. Configure the network file and the hosts mapping file
[*] vi /etc/hosts
[*] vi /etc/sysconfig/network
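[*] For example, assuming the hostname Hadoop and the IP 192.168.57.2 used later in this guide:
    # /etc/hosts
    192.168.57.2   Hadoop

    # /etc/sysconfig/network
    NETWORKING=yes
    HOSTNAME=Hadoop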
[*] 4. Install the JDK
[*] Upload the JDK and extract it
[*] Configure the environment variables:
[*] vi /etc/profile
[*] source /etc/profile
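[*] A minimal sketch of the lines to append to /etc/profile (assuming the JDK was extracted to /opt/jdk, the same path used in hadoop-env.sh below):
    export JAVA_HOME=/opt/jdk
    export PATH=$PATH:$JAVA_HOME/bin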
[*] 5. Install Hadoop
[*] Upload hadoop-2.7.3.tar.gz
[*] Extract it
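[*] For example (extracting to /opt is an assumption; adjust the target directory as you like):
    tar -zxvf hadoop-2.7.3.tar.gz -C /opt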
[*] 6. Configure Hadoop:
[*] Note: during configuration you can refer to:
[*] Offline documentation:
[*] D:\hadoop\tools\hadoop-2.7.3\hadoop-2.7.3\share\doc\hadoop\index.html
[*] Online documentation:
[*] Configuration:
[*] core-site.xml:
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <!-- hostname of this machine; the port is commonly 9000 or 8020 -->
            <value>hdfs://Hadoop:9000</value>
        </property>
        <!-- alternative example with a fully qualified hostname and port 8020:
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop-yarn.beicai.com:8020</value>
        </property>
        -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/opt/modules/hadoop-2.5.0/data/tmp</value>
        </property>
    </configuration>
[*] hdfs-site.xml:
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>
[*] First rename mapred-site.xml.template to mapred-site.xml:
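[*] For example, inside Hadoop's etc/hadoop configuration directory:
    mv mapred-site.xml.template mapred-site.xml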
[*] mapred-site.xml:
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>
[*] yarn-site.xml:
    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    </configuration>
[*] hadoop-env.sh:
[*] Set the JDK path:
[*] export JAVA_HOME=/opt/jdk
[*]
[*]
[*] Configure passwordless SSH login:
[*] (1).cd /root/.ssh/
[*] (2). Generate an RSA key pair:
[*] ssh-keygen -t rsa (just press Enter at every prompt)
[*] (3). View the keys:
[*] ls
[*] id_rsa  id_rsa.pub  known_hosts
[*] (4). Copy the public key to this machine itself:
[*] ssh-copy-id root@Hadoop (ssh-copy-id Hadoop targets the current login user instead of root; since these steps are run as root, the two are equivalent here)
[*] Afterwards the directory also contains:
[*] authorized_keys
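[*] You can verify the passwordless login; it should connect without prompting for a password:
    ssh Hadoop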
[*]
[*] Format the cluster (NameNode):
[*] hdfs namenode -format
[*] Check the log output of the format; it should contain:
[*] 17/02/17 16:18:30 INFO common.Storage: Storage directory
[*] /tmp/hadoop-root/dfs/name has been successfully formatted
[*] Because no dedicated dfs directories were configured (the metadata and data directories, name and data), name and data end up under the Linux /tmp directory:
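[*] If you would rather keep them out of /tmp (which the system may clean up), a minimal sketch is to add the following to hdfs-site.xml before formatting; the /opt/... paths below are placeholders, choose your own:
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/opt/hadoop-2.7.3/data/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/opt/hadoop-2.7.3/data/data</value>
    </property>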
[*]
[*] Start the cluster:
[*] Start the HDFS module:
[*] ./start-dfs.sh
[*]
[*] Check the HDFS processes with jps:
[*] 2608 DataNode
[*] 2480 NameNode
[*] 2771 SecondaryNameNode
[*]
[*] Start the YARN module:
[*] ./start-yarn.sh
[*] Check the YARN processes with jps:
[*] 2958 ResourceManager
[*] 3055 NodeManager
[*]
[*] Upload a file to HDFS:
[*] ./hadoop fs -put /file /
[*]
[*] A data directory now appears under /tmp/hadoop-root/dfs/, which stores the data blocks.
[*]
[*] The /tmp/hadoop-root/dfs/name directory stores the metadata.
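[*] You can also confirm the upload from the command line:
    ./hadoop fs -ls /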
[*]
[*]
[*] View the web UI:
[*] http://192.168.57.2:50070/
[*]
[*] 192.168.57.2 is the NameNode's IP address.
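[*] In this pseudo-distributed setup the YARN ResourceManager web UI runs on the same machine, by default on port 8088:
    http://192.168.57.2:8088/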
[*]
[*]
[*] Configure the Hadoop environment variables (again in /etc/profile):
[*] export HADOOP_HOME=/opt/hadoop-2.7.3
[*] export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
[*] Reload it:
[*] source /etc/profile
[*]
[*] After this you can use the scripts in Hadoop's bin and sbin directories from any path.
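[*] A quick check that the variables took effect, from any directory:
    hadoop version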