
[Experience Sharing] Hadoop Cluster Installation and Verification

Posted on 2016-12-4 11:14:32
 
1. Upload the Hadoop package to /usr on the master
Version: hadoop-1.2.1.tar.gz
Unpack it:
 

tar -zxvf hadoop-1.2.1.tar.gz
This produces a hadoop-1.2.1 directory in the current directory; enter it and create a tmp directory for later use:
 
 

[iyunv@master hadoop-1.2.1]# mkdir tmp
Return to /usr and give the hadoop user ownership of hadoop-1.2.1:
 
 

[iyunv@master usr]# chown -R hadoop:hadoop hadoop-1.2.1/
Aside: in the original run, the tmp directory was created only after the ownership change on the hadoop directory, so formatting the NameNode failed:
 
 

[hadoop@master conf]$ hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.
13/09/08 00:33:06 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master.hadoop/192.168.70.101
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.6.0_45
************************************************************/
13/09/08 00:33:06 INFO util.GSet: Computing capacity for map BlocksMap
13/09/08 00:33:06 INFO util.GSet: VM type       = 32-bit
13/09/08 00:33:06 INFO util.GSet: 2.0% max memory = 1013645312
13/09/08 00:33:06 INFO util.GSet: capacity      = 2^22 = 4194304 entries
13/09/08 00:33:06 INFO util.GSet: recommended=4194304, actual=4194304
13/09/08 00:33:06 INFO namenode.FSNamesystem: fsOwner=hadoop
13/09/08 00:33:06 INFO namenode.FSNamesystem: supergroup=supergroup
13/09/08 00:33:06 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/09/08 00:33:06 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
13/09/08 00:33:06 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
13/09/08 00:33:06 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
13/09/08 00:33:06 INFO namenode.NameNode: Caching file names occuring more than 10 times
13/09/08 00:33:07 ERROR namenode.NameNode: java.io.IOException: Cannot create directory /usr/hadoop-1.2.1/tmp/dfs/name/current
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1337)
at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1356)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1261)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1467)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
13/09/08 00:33:07 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master.hadoop/192.168.70.101
************************************************************/
[hadoop@master conf]$
Fixed after correcting the permissions.
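The aside above comes down to ordering: create tmp before the recursive chown, so that tmp ends up owned by hadoop too. A minimal sketch of the right order, using a throwaway directory so it is safe to run anywhere (paths are illustrative only):

```shell
# Throwaway copy of the layout, to show the ordering that avoids the error above.
dir=$(mktemp -d)
mkdir -p "$dir/hadoop-1.2.1/tmp"                    # create tmp FIRST...
chown -R "$(id -un):$(id -gn)" "$dir/hadoop-1.2.1"  # ...then chown recursively
stat -c '%U' "$dir/hadoop-1.2.1/tmp"                # tmp now has the same owner as the tree
```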
 
2. Configure the Hadoop environment variables (as root, on both master and slaves):
 

[iyunv@master conf]# vi /etc/profile
 
 

HADOOP_HOME=/usr/hadoop-1.2.1
export HADOOP_HOME
PATH=$PATH:$HADOOP_HOME/bin
export PATH
Load the environment variables:
 
 

[iyunv@master conf]# source /etc/profile
Test them:
 
 

[iyunv@master conf]# hadoop
Warning: $HADOOP_HOME is deprecated.
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
namenode -format     format the DFS filesystem
....
 done
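The PATH addition can also be sanity-checked without invoking hadoop at all; a sketch reproducing the /etc/profile lines above:

```shell
# Reproduce the profile lines, then confirm the bin dir really landed on PATH.
HADOOP_HOME=/usr/hadoop-1.2.1
export HADOOP_HOME
PATH=$PATH:$HADOOP_HOME/bin
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "PATH ok" ;;
  *)                      echo "PATH missing $HADOOP_HOME/bin" ;;
esac
```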
 
 
3. Set the Hadoop JAVA_HOME path:
 

[iyunv@slave01 conf]# vi hadoop-env.sh
 
 
 

# The java implementation to use.  Required.
export JAVA_HOME=/usr/jdk1.6.0_45
 
 
4. Edit core-site.xml
 

<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hadoop-1.2.1/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master.hadoop:9000</value>
</property>
</configuration>
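A quick, self-contained way to sanity-check the two values just set is to grep for them; here the same XML is written to a temp file so the sketch runs anywhere (on the cluster, grep the real conf/core-site.xml instead):

```shell
# Write the config body shown above to a temp file, then check both values.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hadoop-1.2.1/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master.hadoop:9000</value>
</property>
</configuration>
EOF
grep -q 'hdfs://master.hadoop:9000' "$conf" && echo "fs.default.name ok"
grep -q '/usr/hadoop-1.2.1/tmp' "$conf"     && echo "hadoop.tmp.dir ok"
```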
 
 
5. Edit hdfs-site.xml
 
 

[hadoop@master conf]$ vi hdfs-site.xml
 
 
 

<configuration>
<property>
<name>dfs.data.dir</name>
<value>/usr/hadoop-1.2.1/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
  
 
6. Edit mapred-site.xml
 

[hadoop@master conf]$ vi mapred-site.xml
 
 
 

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master.hadoop:9001</value>
</property>
</configuration>
 
7. Edit masters and slaves
 

[hadoop@master conf]$ vi masters
Add the hostname or IP:

master.hadoop
 

[hadoop@master conf]$ vi slaves
 

slave01.hadoop
slave02.hadoop
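Both files are simply one hostname per line, so they can be written in one step (same hostnames as above; the temp directory stands in for the real conf directory):

```shell
confdir=$(mktemp -d)   # stand-in for the real Hadoop conf directory
echo master.hadoop > "$confdir/masters"
printf '%s\n' slave01.hadoop slave02.hadoop > "$confdir/slaves"
cat "$confdir/slaves"
```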
8. Distribute the configured Hadoop tree to the slaves. The hadoop user on the slaves has no write permission under /usr yet, so use root on the destination hosts; the source-side user does not matter.
 
 

[iyunv@master usr]#
[iyunv@master usr]# scp -r hadoop-1.2.1/ root@slave01.hadoop:/usr
...
[iyunv@master usr]#
[iyunv@master usr]# scp -r hadoop-1.2.1/ root@slave02.hadoop:/usr
Then fix the ownership of the hadoop-1.2.1 directory on each slave.
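The copy and the follow-up chown can be folded into one loop per slave. The sketch below only prints the commands so it is safe to run anywhere; drop the echo on a real cluster (it assumes root SSH access to the slaves, as in the text):

```shell
# Dry run: print the per-slave copy + ownership-fix commands.
cmds=$(for h in slave01.hadoop slave02.hadoop; do
  echo scp -r hadoop-1.2.1/ "root@$h:/usr"
  echo ssh "root@$h" "chown -R hadoop:hadoop /usr/hadoop-1.2.1"
done)
printf '%s\n' "$cmds"
```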
 
9. Format the HDFS filesystem
 

[hadoop@master usr]$
[hadoop@master usr]$ hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.
14/10/22 07:26:09 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master.hadoop/192.168.1.100
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.6.0_45
************************************************************/
14/10/22 07:26:09 INFO util.GSet: Computing capacity for map BlocksMap
14/10/22 07:26:09 INFO util.GSet: VM type       = 32-bit
14/10/22 07:26:09 INFO util.GSet: 2.0% max memory = 1013645312
14/10/22 07:26:09 INFO util.GSet: capacity      = 2^22 = 4194304 entries
14/10/22 07:26:09 INFO util.GSet: recommended=4194304, actual=4194304
14/10/22 07:26:09 INFO namenode.FSNamesystem: fsOwner=hadoop
14/10/22 07:26:09 INFO namenode.FSNamesystem: supergroup=supergroup
14/10/22 07:26:09 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/10/22 07:26:09 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/10/22 07:26:09 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/10/22 07:26:09 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
14/10/22 07:26:09 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/10/22 07:26:09 INFO common.Storage: Image file /usr/hadoop/tmp/dfs/name/current/fsimage of size 112 bytes saved in 0 seconds.
14/10/22 07:26:09 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/usr/hadoop/tmp/dfs/name/current/edits
14/10/22 07:26:09 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/usr/hadoop/tmp/dfs/name/current/edits
14/10/22 07:26:09 INFO common.Storage: Storage directory /usr/hadoop/tmp/dfs/name has been successfully formatted.
14/10/22 07:26:09 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master.hadoop/192.168.1.100
************************************************************/
[hadoop@master usr]$

 
"successfully formatted" in the output means it worked; compare the aside in step 1.
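Instead of eyeballing the log, a script can check for the success marker. Here the relevant line is inlined so the sketch is self-contained; on a real run, capture `hadoop namenode -format 2>&1` instead:

```shell
# Sample success line from the format log above; replace with real output.
log='14/10/22 07:26:09 INFO common.Storage: Storage directory /usr/hadoop/tmp/dfs/name has been successfully formatted.'
if printf '%s' "$log" | grep -q 'successfully formatted'; then
  result="format ok"
else
  result="format FAILED: check directory permissions (see the aside in step 1)"
fi
echo "$result"
```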
 
 
10. Start Hadoop
Before starting, stop iptables (on both master and slaves), otherwise running jobs may fail:
 

[iyunv@master usr]# service iptables stop
iptables: Flushing firewall rules: [  OK  ]
iptables: Setting chains to policy ACCEPT: filter [  OK  ]
iptables: Unloading modules: [  OK  ]
[iyunv@master usr]#
 
Aside (firewall left running on the slaves):

[hadoop@master hadoop-1.2.1]$ hadoop jar hadoop-examples-1.2.1.jar pi 10 100
Warning: $HADOOP_HOME is deprecated.
Number of Maps  = 10
Samples per Map = 100
13/09/08 02:17:05 INFO hdfs.DFSClient: Exception in createBlockOutputStream 192.168.70.102:50010 java.net.NoRouteToHostException: No route to host
13/09/08 02:17:05 INFO hdfs.DFSClient: Abandoning blk_9160013073143341141_4460
13/09/08 02:17:05 INFO hdfs.DFSClient: Excluding datanode 192.168.70.102:50010
13/09/08 02:17:05 INFO hdfs.DFSClient: Exception in createBlockOutputStream 192.168.70.103:50010 java.net.NoRouteToHostException: No route to host
13/09/08 02:17:05 INFO hdfs.DFSClient: Abandoning blk_-1734085534405596274_4461
13/09/08 02:17:05 INFO hdfs.DFSClient: Excluding datanode 192.168.70.103:50010
13/09/08 02:17:05 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/PiEstimator_TMP_3_141592654/in/part0 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
Resolved after stopping the firewall. Also, prefer IP addresses in the configuration where possible.
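The telltale symptom is NoRouteToHostException in the client log; a sketch that spots it automatically (sample line inlined, taken from the aside above):

```shell
# Sample failure line; on a cluster, feed in the real job output instead.
log='13/09/08 02:17:05 INFO hdfs.DFSClient: Exception in createBlockOutputStream 192.168.70.102:50010 java.net.NoRouteToHostException: No route to host'
if printf '%s' "$log" | grep -q 'NoRouteToHostException'; then
  hint="firewall likely blocking the datanode port 50010: run 'service iptables stop' on the slaves"
fi
echo "$hint"
```

On CentOS/RHEL-era systems, `chkconfig iptables off` additionally keeps the firewall disabled across reboots.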
Start:
 

[iyunv@master usr]# su hadoop
[hadoop@master usr]$ start-all.sh
Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /usr/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-namenode-master.hadoop.out
slave01.hadoop: starting datanode, logging to /usr/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-slave01.hadoop.out
slave02.hadoop: starting datanode, logging to /usr/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-slave02.hadoop.out
The authenticity of host 'master.hadoop (192.168.70.101)' can't be established.
RSA key fingerprint is 6c:e0:d7:22:92:80:85:fb:a6:d6:a4:8f:75:b0:96:7e.
Are you sure you want to continue connecting (yes/no)? yes
master.hadoop: Warning: Permanently added 'master.hadoop,192.168.70.101' (RSA) to the list of known hosts.
master.hadoop: starting secondarynamenode, logging to /usr/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-master.hadoop.out
starting jobtracker, logging to /usr/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-master.hadoop.out
slave02.hadoop: starting tasktracker, logging to /usr/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-slave02.hadoop.out
slave01.hadoop: starting tasktracker, logging to /usr/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-slave01.hadoop.out
[hadoop@master usr]$
The log shows the startup order: namenode (master) -> datanode (slave01, slave02) -> secondarynamenode (master) -> jobtracker (master) -> finally tasktracker (slave01, slave02).
 
11. Verify
Check the Hadoop processes with jps on the master and each slave.
master:
 

[hadoop@master tmp]$ jps
6009 Jps
5560 SecondaryNameNode
5393 NameNode
5627 JobTracker
[hadoop@master tmp]$
 slave01:
 
 

[hadoop@slave01 tmp]$ jps
3855 Jps
3698 TaskTracker
3636 DataNode
 slave02:
 
 

[iyunv@slave02 tmp]# jps
3628 TaskTracker
3748 Jps
3567 DataNode
[iyunv@slave02 tmp]#
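These spot checks can be scripted by grepping jps output for the daemons each node should run. A sketch for the master, with its sample output inlined; on a live node, pipe `jps` in directly:

```shell
# Sample jps output from the master; replace with `jps` output on a live node.
out='5560 SecondaryNameNode
5393 NameNode
5627 JobTracker'
up=0
for d in NameNode SecondaryNameNode JobTracker; do
  printf '%s\n' "$out" | grep -q " $d$" && up=$((up + 1))
done
echo "$up/3 master daemons running"
```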
Check the cluster status with hadoop dfsadmin -report:
 
 

[hadoop@master tmp]$ hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.
Configured Capacity: 14174945280 (13.2 GB)
Present Capacity: 7577288704 (7.06 GB)
DFS Remaining: 7577231360 (7.06 GB)
DFS Used: 57344 (56 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)
Name: 192.168.70.103:50010
Decommission Status : Normal
Configured Capacity: 7087472640 (6.6 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 3298820096 (3.07 GB)
DFS Remaining: 3788623872(3.53 GB)
DFS Used%: 0%
DFS Remaining%: 53.46%
Last contact: Sun Sep 08 01:19:18 PDT 2013

Name: 192.168.70.102:50010
Decommission Status : Normal
Configured Capacity: 7087472640 (6.6 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 3298836480 (3.07 GB)
DFS Remaining: 3788607488(3.53 GB)
DFS Used%: 0%
DFS Remaining%: 53.45%
Last contact: Sun Sep 08 01:19:17 PDT 2013
[hadoop@master tmp]$
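For scripting, the live-datanode count can be pulled straight out of the report (sample line inlined here; on a cluster, pipe `hadoop dfsadmin -report` into the same sed):

```shell
# Sample line from the report above; extract the live-datanode count.
line='Datanodes available: 2 (2 total, 0 dead)'
live=$(printf '%s' "$line" | sed -n 's/^Datanodes available: \([0-9]*\).*/\1/p')
echo "live datanodes: $live"
```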
Cluster web UIs (use the master's IP):
 
http://192.168.70.101:50030
http://192.168.70.101:50070/
 
12. Run a job: estimate pi

[hadoop@master hadoop-1.2.1]$ hadoop jar hadoop-examples-1.2.1.jar pi 10 100
The first argument (10) is the number of map tasks to run; the second (100) is the number of samples taken per map.
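The example jar does a Monte Carlo estimate: each map throws its samples at the unit square and counts how many land inside the quarter circle. The same idea, purely local and illustrative (not the jar's actual code), in awk with a fixed seed for reproducibility:

```shell
# Monte Carlo pi, 100000 points: pi ~ 4 * (points inside quarter circle) / total.
pi=$(awk 'BEGIN {
  srand(42); n = 100000; hits = 0
  for (i = 0; i < n; i++) {
    x = rand(); y = rand()
    if (x*x + y*y <= 1) hits++
  }
  printf "%.2f", 4 * hits / n
}')
echo "pi ~= $pi"
```

More maps times more samples tightens the estimate, which is why the distributed version splits the work across 10 map tasks.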
Expected output:

[hadoop@master hadoop-1.2.1]$ hadoop jar hadoop-examples-1.2.1.jar pi 10 100
Warning: $HADOOP_HOME is deprecated.
Number of Maps  = 10
Samples per Map = 100
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
13/09/08 02:21:50 INFO mapred.FileInputFormat: Total input paths to process : 10
13/09/08 02:21:52 INFO mapred.JobClient: Running job: job_201309080221_0001
13/09/08 02:21:53 INFO mapred.JobClient:  map 0% reduce 0%
13/09/08 02:24:06 INFO mapred.JobClient:  map 10% reduce 0%
13/09/08 02:24:07 INFO mapred.JobClient:  map 20% reduce 0%
13/09/08 02:24:21 INFO mapred.JobClient:  map 30% reduce 0%
13/09/08 02:24:28 INFO mapred.JobClient:  map 40% reduce 0%
13/09/08 02:24:31 INFO mapred.JobClient:  map 50% reduce 0%
13/09/08 02:24:32 INFO mapred.JobClient:  map 60% reduce 0%
13/09/08 02:24:38 INFO mapred.JobClient:  map 70% reduce 0%
13/09/08 02:24:41 INFO mapred.JobClient:  map 80% reduce 13%
13/09/08 02:24:44 INFO mapred.JobClient:  map 80% reduce 23%
13/09/08 02:24:45 INFO mapred.JobClient:  map 100% reduce 23%
13/09/08 02:24:47 INFO mapred.JobClient:  map 100% reduce 26%
13/09/08 02:24:53 INFO mapred.JobClient:  map 100% reduce 100%
13/09/08 02:24:54 INFO mapred.JobClient: Job complete: job_201309080221_0001
13/09/08 02:24:54 INFO mapred.JobClient: Counters: 30
13/09/08 02:24:54 INFO mapred.JobClient:   Job Counters
13/09/08 02:24:54 INFO mapred.JobClient:     Launched reduce tasks=1
13/09/08 02:24:54 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=638017
13/09/08 02:24:54 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/09/08 02:24:54 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/09/08 02:24:54 INFO mapred.JobClient:     Launched map tasks=10
13/09/08 02:24:54 INFO mapred.JobClient:     Data-local map tasks=10
13/09/08 02:24:54 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=44458
13/09/08 02:24:54 INFO mapred.JobClient:   File Input Format Counters
13/09/08 02:24:54 INFO mapred.JobClient:     Bytes Read=1180
13/09/08 02:24:54 INFO mapred.JobClient:   File Output Format Counters
13/09/08 02:24:54 INFO mapred.JobClient:     Bytes Written=97
13/09/08 02:24:54 INFO mapred.JobClient:   FileSystemCounters
13/09/08 02:24:54 INFO mapred.JobClient:     FILE_BYTES_READ=226
13/09/08 02:24:54 INFO mapred.JobClient:     HDFS_BYTES_READ=2460
13/09/08 02:24:54 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=623419
13/09/08 02:24:54 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=215
13/09/08 02:24:54 INFO mapred.JobClient:   Map-Reduce Framework
13/09/08 02:24:54 INFO mapred.JobClient:     Map output materialized bytes=280
13/09/08 02:24:54 INFO mapred.JobClient:     Map input records=10
13/09/08 02:24:54 INFO mapred.JobClient:     Reduce shuffle bytes=280
13/09/08 02:24:54 INFO mapred.JobClient:     Spilled Records=40
13/09/08 02:24:54 INFO mapred.JobClient:     Map output bytes=180
13/09/08 02:24:54 INFO mapred.JobClient:     Total committed heap usage (bytes)=1414819840
13/09/08 02:24:54 INFO mapred.JobClient:     CPU time spent (ms)=377130
13/09/08 02:24:54 INFO mapred.JobClient:     Map input bytes=240
13/09/08 02:24:54 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1280
13/09/08 02:24:54 INFO mapred.JobClient:     Combine input records=0
13/09/08 02:24:54 INFO mapred.JobClient:     Reduce input records=20
13/09/08 02:24:54 INFO mapred.JobClient:     Reduce input groups=20
13/09/08 02:24:54 INFO mapred.JobClient:     Combine output records=0
13/09/08 02:24:54 INFO mapred.JobClient:     Physical memory (bytes) snapshot=1473769472
13/09/08 02:24:54 INFO mapred.JobClient:     Reduce output records=0
13/09/08 02:24:54 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=4130349056
13/09/08 02:24:54 INFO mapred.JobClient:     Map output records=20
Job Finished in 184.973 seconds
Estimated value of Pi is 3.14800000000000000000
[hadoop@master hadoop-1.2.1]$
If it fails because the slave firewall was not stopped, see the aside in step 10.

Source: https://www.iyunv.com/thread-309505-1-1.html