gxh1968 发表于 2016-12-7 07:12:49

hadoop使用过程中的一些问题

1 如何知道一个文件在HDFS上block的分布情况
http://stackoverflow.com/questions/6372060/how-to-track-which-data-block-is-in-which-data-node-in-hadoop
2 用windows 电脑向linux hadoop集群上提交job失败
org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control
这是hadoop的bug https://issues.apache.org/jira/browse/MAPREDUCE-4052
jira上写的在2.4.0已经被fix,我们用的2.5,需要在配置文件中加入

<property>
<description>If enabled, user can submit an application cross-platform
i.e. submit an application from a Windows client to a Linux/Unix server or
vice versa.
</description>
<name>mapreduce.app-submission.cross-platform</name>
<value>false</value>
</property>

3 如何找到各种mapreduce job的日志,方便查看
http://www.iteblog.com/archives/896
4 如何指定hadoop client的用户名
根据UserGroupInformation的源码分析,可以设置HADOOP_USER_NAME环境变量或者系统属性

//If we don't have a kerberos user and security is disabled, check
//if user is specified in the environment or properties
if (!isSecurityEnabled() && (user == null)) {
String envUser = System.getenv(HADOOP_USER_NAME);
if (envUser == null) {
envUser = System.getProperty(HADOOP_USER_NAME);
}
user = envUser == null ? null : new User(envUser);
}
页: [1]
查看完整版本: hadoop使用过程中的一些问题