4. Run
# hadoop fs -ls
12/01/31 14:04:39 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
12/01/31 14:04:39 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
Found 1 items
drwxr-xr-x - root supergroup 0 2012-01-31 13:57 /user/root/test
Problem 2: running # hadoop fs -put ../conf input fails with the following error:
12/01/31 16:01:25 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
12/01/31 16:01:25 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
12/01/31 16:01:26 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: File /user/root/input/ssl-server.xml.example could only be replicated to 0 nodes, instead of 1
put: File /user/root/input/ssl-server.xml.example could only be replicated to 0 nodes, instead of 1
12/01/31 16:01:26 ERROR hdfs.DFSClient: Exception closing file /user/root/input/ssl-server.xml.example : java.io.IOException: File /user/root/input/ssl-server.xml.example could only be replicated to 0 nodes, instead of 1
Solution:
This error occurs because no DataNode has registered with the NameNode. The daemons must come up in order: first the NameNode, then the DataNode, and then the JobTracker and TaskTracker. The workaround for now is to start the daemons individually:
# hadoop-daemon.sh start namenode
# hadoop-daemon.sh start datanode
1. Restart the NameNode
# hadoop-daemon.sh stop namenode
stopping namenode
# hadoop-daemon.sh start namenode
starting namenode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-namenode-www.keli.com.out
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
2. Restart the DataNode
# hadoop-daemon.sh stop datanode
stopping datanode
# hadoop-daemon.sh start datanode
starting datanode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-datanode-www.keli.com.out
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
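After restarting, it is worth confirming that the DataNode process actually stayed up before retrying the put. A minimal check, assuming the DataNode runs as an ordinary JVM process whose main class is org.apache.hadoop.hdfs.server.datanode.DataNode:

```shell
# Scan the process table for the DataNode's main class.
# (Assumption: the standard class name; jps also works if the JDK is on PATH.)
if ps aux | grep -v grep | grep -q 'hdfs.server.datanode.DataNode'; then
  status="running"
else
  status="not running"
fi
echo "DataNode is $status"
```

If the DataNode is up but the put still fails with "could only be replicated to 0 nodes", check the datanode log for connectivity or dfs.data.dir errors.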
Problem 3: starting the DataNode fails with Unrecognized option: -jvm and Could not create the Java virtual machine.
[iyunv@www bin]# hadoop-daemon.sh start datanode
starting datanode, logging to /usr/hadoop-0.20.203.0/bin/../logs/hadoop-root-datanode-www.keli.com.out
Unrecognized option: -jvm
Could not create the Java virtual machine.
Solution:
The bin/hadoop script under the Hadoop installation directory contains the following shell snippet:
CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
if [[ $EUID -eq 0 ]]; then
  HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
else
  HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
fi
Look at the branch:
if [[ $EUID -eq 0 ]]; then
  HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
What does it mean for $EUID to be 0?
The effective user ID (EUID) is the identity used to assign ownership to newly created processes, to check file access permissions, and to check whether a process is allowed to send a signal via the kill system call.
Running echo $EUID as root prints 0.
So when the script runs as root, the -jvm option is appended to HADOOP_OPTS, and that is where the "Unrecognized option: -jvm" error comes from: java does not recognize -jvm. Starting the DataNode as a non-root user avoids this branch.
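The branch is easy to reproduce standalone; the sketch below mirrors the logic of bin/hadoop with a simplified variable name:

```shell
# $EUID is set automatically by bash to the effective user ID; root is 0.
# Mirrors the branch in bin/hadoop that picks the DataNode JVM options.
if [[ $EUID -eq 0 ]]; then
  opts="-jvm server"   # root branch: java rejects the -jvm flag
else
  opts="-server"       # non-root branch: a normal JVM option
fi
echo "EUID=$EUID -> options: $opts"
```

So the simplest workaround is to start the DataNode as a non-root user, which takes the -server branch instead.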
4. Click Run to execute the program.
The output is as follows:
12/11/24 17:08:59 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/11/24 17:08:59 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
12/11/24 17:08:59 INFO input.FileInputFormat: Total input paths to process : 2
12/11/24 17:08:59 INFO mapred.JobClient: Running job: job_local_0001
12/11/24 17:08:59 INFO input.FileInputFormat: Total input paths to process : 2
12/11/24 17:09:00 INFO mapred.MapTask: io.sort.mb = 100
12/11/24 17:09:00 INFO mapred.MapTask: data buffer = 79691776/99614720
12/11/24 17:09:00 INFO mapred.MapTask: record buffer = 262144/327680
12/11/24 17:09:00 INFO mapred.MapTask: Starting flush of map output
12/11/24 17:09:00 INFO mapred.MapTask: Finished spill 0
12/11/24 17:09:00 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/11/24 17:09:00 INFO mapred.LocalJobRunner:
12/11/24 17:09:00 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
12/11/24 17:09:00 INFO mapred.MapTask: io.sort.mb = 100
12/11/24 17:09:00 INFO mapred.MapTask: data buffer = 79691776/99614720
12/11/24 17:09:00 INFO mapred.MapTask: record buffer = 262144/327680
12/11/24 17:09:00 INFO mapred.MapTask: Starting flush of map output
12/11/24 17:09:00 INFO mapred.MapTask: Finished spill 0
12/11/24 17:09:00 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
12/11/24 17:09:00 INFO mapred.LocalJobRunner:
12/11/24 17:09:00 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000001_0' done.
12/11/24 17:09:00 INFO mapred.LocalJobRunner:
12/11/24 17:09:00 INFO mapred.Merger: Merging 2 sorted segments
12/11/24 17:09:00 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 77 bytes
12/11/24 17:09:00 INFO mapred.LocalJobRunner:
12/11/24 17:09:00 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
12/11/24 17:09:00 INFO mapred.LocalJobRunner:
12/11/24 17:09:00 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now
12/11/24 17:09:00 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to hdfs://localhost:9000/user/tgrqap6qhvnoh4d/administrator/output01
12/11/24 17:09:00 INFO mapred.LocalJobRunner: reduce > reduce
12/11/24 17:09:00 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
12/11/24 17:09:01 INFO mapred.JobClient: map 100% reduce 100%
12/11/24 17:09:01 INFO mapred.JobClient: Job complete: job_local_0001
12/11/24 17:09:01 INFO mapred.JobClient: Counters: 14
12/11/24 17:09:01 INFO mapred.JobClient: FileSystemCounters
12/11/24 17:09:01 INFO mapred.JobClient: FILE_BYTES_READ=51328
12/11/24 17:09:01 INFO mapred.JobClient: HDFS_BYTES_READ=139
12/11/24 17:09:01 INFO mapred.JobClient: FILE_BYTES_WRITTEN=104570
12/11/24 17:09:01 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=41
12/11/24 17:09:01 INFO mapred.JobClient: Map-Reduce Framework
12/11/24 17:09:01 INFO mapred.JobClient: Reduce input groups=5
12/11/24 17:09:01 INFO mapred.JobClient: Combine output records=6
12/11/24 17:09:01 INFO mapred.JobClient: Map input records=4
12/11/24 17:09:01 INFO mapred.JobClient: Reduce shuffle bytes=0
12/11/24 17:09:01 INFO mapred.JobClient: Reduce output records=5
12/11/24 17:09:01 INFO mapred.JobClient: Spilled Records=12
12/11/24 17:09:01 INFO mapred.JobClient: Map output bytes=82
12/11/24 17:09:01 INFO mapred.JobClient: Combine input records=8
12/11/24 17:09:01 INFO mapred.JobClient: Map output records=8
12/11/24 17:09:01 INFO mapred.JobClient: Reduce input records=6
Common problems
(1)
After running $ bin/start-all.sh, startup fails and the namenode log under logs shows:
2011-08-03 08:43:08,068 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:136)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:176)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:206)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:240)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:434)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1162)
Solution: the configuration files were left unconfigured; the NullPointerException in NameNode.getAddress indicates that fs.default.name is not set. Configure core-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
  <final>true</final>
</property>
Then configure mapred-site.xml:
<property>
  <name>mapred.job.tracker</name>
  <value>hdfs://localhost:9001</value>
  <final>true</final>
</property>
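Note that both snippets above are bare <property> elements; each belongs inside the <configuration> root element of its file. For example, a complete conf/mapred-site.xml along these lines would be:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://localhost:9001</value>
    <final>true</final>
  </property>
</configuration>
```

Restart the daemons after changing either file so the new values are picked up.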
(2) Running hadoop fs -ls
prints: ls: Cannot access .: No such file or directory.
This happens because the user's HDFS home directory is empty. Run
hadoop fs -ls /
and one entry appears. Run hadoop fs -mkdir hello (where hello is a directory name), then run the ls command again and the result shows up.
(3) The TaskTracker fails to start; the tasktracker log under logs shows the following error:
2011-08-03 08:46:45,750 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Failed to set permissions of path: /wen/hadoop/working/dir1/ttprivate to 0700
at org.apache.hadoop.fs.RawLocalFileSystem.checkReturnValue(RawLocalFileSystem.java:525)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:499)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:318)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:183)
at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:635)
at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1328)
at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3430)
Solution: 1. Check the Hadoop version: this bug exists in 0.20.203, so switch back to 0.20.2; 2. Perform step 5 to grant the permissions.
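The permission fix boils down to making ttprivate under mapred.local.dir owned by the TaskTracker user with mode 0700. A safe sketch using a temporary directory in place of the real /wen/hadoop/working/dir1 (that path is specific to this installation):

```shell
# Recreate the 0700 permission the TaskTracker expects on ttprivate.
# A temp directory stands in for the real mapred.local.dir (assumption).
BASE=$(mktemp -d)
mkdir -p "$BASE/ttprivate"
chmod 700 "$BASE/ttprivate"
echo "mode: $(stat -c '%a' "$BASE/ttprivate")"   # prints "mode: 700" on GNU stat
```

On the real cluster, apply the same chmod (plus a chown to the TaskTracker user) to the directory named in mapred.local.dir.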