ts2009 posted on 2018-10-29 09:28:17

Hadoop cluster environment setup

  Steps for setting up a Hadoop big data cluster environment -- pseudo-distributed installation
  Required software: VMware Workstation 11.0
  jdk1.7.0_67
  hadoop-2.7.3
  FileZilla (FTP tool)
  Setup steps:

[*]  First, install a Linux server (this step is skipped here); if you need help, search online for how to install a Linux system.
[*]  1. Turn off the firewall:
[*]  service iptables stop
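[*]  (To keep the firewall off after a reboot, a common companion step on CentOS 6-style systems, assuming chkconfig manages iptables:)
[*]  chkconfig iptables off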
[*]  2. Set the IP address:
[*]  vi /etc/sysconfig/network-scripts/ifcfg-eth0
[*]  or change it through the graphical tool!
[*]  3. Set the network file and the hosts mapping file (example contents sketched below):
[*]  vi /etc/hosts
[*]  vi /etc/sysconfig/network
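[*]  A minimal sketch of the two files, assuming the hostname Hadoop and the static IP 192.168.57.2 used later in this guide:
[*]  # /etc/hosts -- map the hostname to the IP
[*]  192.168.57.2   Hadoop
[*]  # /etc/sysconfig/network -- set the hostname
[*]  NETWORKING=yes
[*]  HOSTNAME=Hadoop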
[*]  4. Install the JDK:
[*]  upload the JDK and extract it
[*]  configure the environment variables (a sketch follows this step):
[*]  vi /etc/profile
[*]  source /etc/profile
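[*]  A sketch of the lines to append to /etc/profile, assuming the JDK was extracted to /opt/jdk (the same path used in hadoop-env.sh below):
[*]  # JDK environment variables
[*]  export JAVA_HOME=/opt/jdk
[*]  export PATH=$PATH:$JAVA_HOME/bin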
[*]  5. Install Hadoop:
[*]  upload hadoop-2.7.3.tar.gz
[*]  extract it (example command below)
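[*]  For example, assuming the archive was uploaded to /root and /opt is the install location (matching the paths used later in this guide):
[*]  # extract the Hadoop tarball into /opt
[*]  tar -zxvf /root/hadoop-2.7.3.tar.gz -C /opt/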
[*]  6. Configure Hadoop:
[*]  Note: during configuration you can refer to:
[*]  the offline documentation:
[*]  D:\hadoop\tools\hadoop-2.7.3\hadoop-2.7.3\share\doc\hadoop\index.html
[*]  or the online documentation on the Apache Hadoop site.
[*]  Configuration:
[*]  core-site.xml (the default filesystem port is 9000 here; 8020 is also commonly used, e.g. hdfs://hadoop-yarn.beicai.com:8020):
[*]  <configuration>
[*]    <property>
[*]      <name>fs.defaultFS</name>
[*]      <value>hdfs://Hadoop:9000</value>
[*]    </property>
[*]  </configuration>
[*]  hadoop.tmp.dir (e.g. /opt/modules/hadoop-2.5.0/data/tmp) can also be set here; it is left unset in this walkthrough, so the name and data directories land under /tmp, as the format log below shows.
[*]  hdfs-site.xml:
[*]  <configuration>
[*]    <property>
[*]      <name>dfs.replication</name>
[*]      <value>1</value>
[*]    </property>
[*]  </configuration>
[*]
[*]  First rename mapred-site.xml.template to mapred-site.xml (see the mv command after this config).
[*]  mapred-site.xml:
[*]  <configuration>
[*]    <property>
[*]      <name>mapreduce.framework.name</name>
[*]      <value>yarn</value>
[*]    </property>
[*]  </configuration>
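[*]  The rename can be done like this, assuming Hadoop was extracted to /opt/hadoop-2.7.3 (the config files live under etc/hadoop):
[*]  cd /opt/hadoop-2.7.3/etc/hadoop
[*]  mv mapred-site.xml.template mapred-site.xml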
[*]
[*]  yarn-site.xml:
[*]  <configuration>
[*]    <property>
[*]      <name>yarn.nodemanager.aux-services</name>
[*]      <value>mapreduce_shuffle</value>
[*]    </property>
[*]  </configuration>
[*]
[*]  hadoop-env.sh:
[*]  Configure the JDK:
[*]  export JAVA_HOME=/opt/jdk
[*]
[*]
[*]  Configure passwordless SSH login:
[*]  (1). cd /root/.ssh/
[*]  (2). Generate an RSA key pair:
[*]  ssh-keygen -t rsa   (just press Enter through every prompt!)
[*]  (3). Check the keys:
[*]  ls
[*]  id_rsa  id_rsa.pub  known_hosts
[*]  (4). Copy the public key to this machine itself!
[*]  ssh-copy-id root@Hadoop (how does this differ from ssh-copy-id Hadoop? Without a user, the current login user is assumed, so when running as root the two are equivalent.)
[*]  Afterwards you can see in the directory:
[*]  authorized_keys
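[*]  A quick sanity check that the passwordless login works (not part of the original steps):
[*]  ssh root@Hadoop   # should log in without asking for a password
[*]  exit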
[*]
[*]  Format the cluster:
[*]  hdfs namenode -format
[*]  Check the log output from the format:
[*]  17/02/17 16:18:30 INFO common.Storage: Storage directory
[*]  /tmp/hadoop-root/dfs/name has been successfully formatted
[*]  Because no dedicated dfs directories were configured (the metadata and data directories, name and data), they end up under the Linux /tmp directory:
[*]
[*]  Start the cluster:
[*]  Start the HDFS module (from Hadoop's sbin directory):
[*]  ./start-dfs.sh
[*]
[*]  Check the HDFS processes with jps:
[*]  2608 DataNode
[*]  2480 NameNode
[*]  2771 SecondaryNameNode
[*]
[*]  Start the YARN module:
[*]  ./start-yarn.sh
[*]  Check the YARN processes with jps:
[*]  2958 ResourceManager
[*]  3055 NodeManager
[*]
[*]  Upload a file to HDFS (from Hadoop's bin directory):
[*]  ./hadoop fs -put /file /
[*]
[*]  The /tmp/hadoop-root/dfs/ directory now contains an extra data directory, which stores the data blocks.
[*]
[*]  The /tmp/hadoop-root/dfs/name directory stores the metadata!
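[*]  To confirm the upload, list the HDFS root (a standard command, shown here as a quick check):
[*]  ./hadoop fs -ls /
[*]  (the uploaded /file should appear in the listing)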
[*]
[*]
[*]  View the web UI:
[*]  http://192.168.57.2:50070/
[*]
[*]  192.168.57.2 is the NameNode's IP address
[*]
[*]
[*]  Configure the Hadoop environment variables:
[*]  export HADOOP_HOME=/opt/hadoop-2.7.3
[*]  export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
[*]  Reload:
[*]  source /etc/profile
[*]
[*]  After this, the scripts in Hadoop's bin and sbin directories can be run from any path!
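[*]  With the PATH updated, for example, these now work from any directory (hadoop version is a standard CLI command; the listing repeats the check above):
[*]  hadoop version
[*]  hadoop fs -ls /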
