[Experience Share] hadoop yarn-distributedshell

Posted 2016-12-04 10:07:12
  As its name suggests, this application runs a shell command on distributed nodes (containers), so it is easy to use; let's go ahead.
  1. run the 'ls' command in containers
  2. which path does the command run in?
  3. how to run meaningful, node-specific commands
  1. run the 'ls' command in containers
  

hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar -shell_command ls -num_containers 1 -container_memory 300 -master_memory 400
  so the 'ls' command will run in some container, and the result will look like this:

more userlogs/application_1433385109839_0001/container_1433385109839_0001_01_000002/stdout
container_tokens
default_container_executor.sh
launch_container.sh
tmp

  why does the working directory contain these files? You can look into the <nodemanager.log>:

2015-06-04 15:55:10,424 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://localhost:9000/user/userxx/DistributedShell/application_1433403689317_0001/AppMaster.jar(->/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/application_1433403689317_0001/filecache/10/AppMaster.jar) transitioned from DOWNLOADING to LOCALIZED
2015-06-04 15:55:10,502 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://localhost:9000/user/userxx/DistributedShell/application_1433403689317_0001/shellCommands(->/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/application_1433403689317_0001/filecache/11/shellCommands) transitioned from DOWNLOADING to LOCALIZED
2015-06-04 15:55:10,644 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [nice, -n, 0, bash, /usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/application_1433403689317_0001/container_1433403689317_0001_01_000001/default_container_executor.sh]
  you will see a script named 'default_container_executor.sh' placed in the working dir (named after the current container), so the result of the command is correct.
  2. which path does the command run in?
  yes, the result is absolutely right, but how do we verify that the current working directory lies in 'container_1433385109839_0001_01_000001'?
  of course, it's simple too: use 'pwd' instead of 'ls' for the -shell_command param.

hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar -shell_command pwd -num_containers 1 -container_memory 300 -master_memory 400
  now, check out the stdout file; the result will look like this:

/usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/application_1433403689317_0002/container_1433403689317_0002_01_000002

  but this time the dir differs a bit from point 1, as this is the second app ;)
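To collect such results without hunting for directories by hand, a small loop over the NodeManager's userlogs tree can print every container's stdout. This is only a sketch: the LOG_ROOT path and application id below are illustrative, matching the layout shown above.

```shell
# Print the stdout of every container of one application, assuming the
# default NodeManager userlogs layout seen in this post (paths illustrative).
LOG_ROOT=${LOG_ROOT:-userlogs}
APP=application_1433403689317_0002
for f in "$LOG_ROOT/$APP"/container_*/stdout; do
  [ -f "$f" ] || continue   # skip when the glob matched nothing
  echo "== $f =="
  cat "$f"
done
```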
  3. how to run meaningful, node-specific commands
  but if you want to use a *custom script* (with some params in the command) that is *node-specific* (i.e. produces a different result on different nodes), you can use a script file to achieve this:

hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.1.jar -shell_script ls-command.sh -num_containers 1 -container_memory 300 -master_memory 400
  and the file 'ls-command.sh' is simple:

ls -al /tmp/
  note that this file must be executable, so do that before running the command above:

chmod +x ls-command.sh
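As a hypothetical example of a node-specific script (content is my own sketch, using only standard Unix tools), each container could report the host it runs on, so the output differs from node to node:

```shell
#!/bin/sh
# Hypothetical node-specific script: each container reports the host it
# runs on and its working directory, so output differs across nodes.
echo "node: $(hostname)"
echo "cwd:  $(pwd)"
ls -al /tmp/
```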
      
  appendix:
  A. from the <nodemanager.log>, we find this info:

2015-06-04 15:55:17,223 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1433403689317_0001 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
2015-06-04 15:55:17,223 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/application_1433403689317_0001

  so if you check the appcache dir afterwards, nothing will be there:

ll /usr/local/hadoop/data-2.5.1/tmp/nm-local-dir/usercache/userxx/appcache/
total 0

  B. the AM is responsible for setting up the containers; in the end, the NM starts them up:

more userlogs/application_1433385109839_0001/container_1433385109839_0001_01_000001/AppMaster.stderr
15/06/04 12:26:09 INFO distributedshell.ApplicationMaster: Initializing ApplicationMaster
15/06/04 12:26:09 INFO distributedshell.ApplicationMaster: Application master for app, appId=1, clustertimestamp=1433385109839, attemptId=1
2015-06-04 12:26:09.755 java[1261:1903] Unable to load realm info from SCDynamicStore
15/06/04 12:26:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/04 12:26:10 INFO impl.TimelineClientImpl: Timeline service is not enabled
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Starting ApplicationMaster
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Executing with tokens:
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@7950d786)
15/06/04 12:26:10 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8030
15/06/04 12:26:10 INFO impl.NMClientAsyncImpl: Upper bound of the thread pool size is 500
15/06/04 12:26:10 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-nodemanagers-proxies : 500
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Max mem capabililty of resources in this cluster 8192
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Max vcores capabililty of resources in this cluster 32
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Received 0 previous AM's running containers on AM registration.
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[<memory:300, vCores:1>]Priority[0]
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[<memory:300, vCores:1>]Priority[0]
15/06/04 12:26:12 INFO impl.AMRMClientImpl: Received new token for : localhost:52226
15/06/04 12:26:12 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, allocatedCnt=1
15/06/04 12:26:12 INFO distributedshell.ApplicationMaster: Launching shell command on a new container., containerId=container_1433385109839_0001_01_000002, containerNode=localhost:52226, containerNodeURI=localhost:8042, containerResourceMemory1024, containerResourceVirtualCores1
15/06/04 12:26:12 INFO distributedshell.ApplicationMaster: Setting up container launch container for containerid=container_1433385109839_0001_01_000002
15/06/04 12:26:12 INFO impl.NMClientAsyncImpl: Processing Event EventType: START_CONTAINER for Container container_1433385109839_0001_01_000002
15/06/04 12:26:12 INFO impl.ContainerManagementProtocolProxy: Opening proxy : localhost:52226
15/06/04 12:26:12 INFO impl.NMClientAsyncImpl: Processing Event EventType: QUERY_CONTAINER for Container container_1433385109839_0001_01_000002
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=1
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1433385109839_0001_01_000002, state=COMPLETE, exitStatus=0, diagnostics=
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1433385109839_0001_01_000002
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, allocatedCnt=1
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Launching shell command on a new container., containerId=container_1433385109839_0001_01_000003, containerNode=localhost:52226, containerNodeURI=localhost:8042, containerResourceMemory1024, containerResourceVirtualCores1
15/06/04 12:26:13 INFO distributedshell.ApplicationMaster: Setting up container launch container for containerid=container_1433385109839_0001_01_000003
15/06/04 12:26:13 INFO impl.NMClientAsyncImpl: Processing Event EventType: START_CONTAINER for Container container_1433385109839_0001_01_000003
15/06/04 12:26:13 INFO impl.NMClientAsyncImpl: Processing Event EventType: QUERY_CONTAINER for Container container_1433385109839_0001_01_000003
15/06/04 12:26:14 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=1
15/06/04 12:26:14 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1433385109839_0001_01_000003, state=COMPLETE, exitStatus=0, diagnostics=
15/06/04 12:26:14 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1433385109839_0001_01_000003
15/06/04 12:26:14 INFO distributedshell.ApplicationMaster: Application completed. Stopping running containers
15/06/04 12:26:14 INFO impl.ContainerManagementProtocolProxy: Closing proxy : localhost:52226
15/06/04 12:26:14 INFO distributedshell.ApplicationMaster: Application completed. Signalling finish to RM
15/06/04 12:26:14 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
15/06/04 12:26:15 INFO distributedshell.ApplicationMaster: Application Master completed successfully. exiting
  and the AM is always started first, in the first container, before the others.
  C. questions: my MacBook Pro has 8 GB RAM and a dual-core i5 (2.4 GHz) CPU, yet the log above reports 32 vcores:

15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Max mem capabililty of resources in this cluster 8192
15/06/04 12:26:10 INFO distributedshell.ApplicationMaster: Max vcores capabililty of resources in this cluster 32

  does anyone know why? I decided to dig into it the next day.
  after I recreated the job on a big cluster (32 GB mem, 8 CPUs), the values stayed the same, so I figured these are config values set in code or XML.
  today, I dug into 'CapacityScheduler#getMaximumAllocation()':


public Resource getMaximumAllocation() {
    int maximumMemory = getInt(
        YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB,
        YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_MB);
    int maximumCores = getInt(
        YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES,
        YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES);
    return Resources.createResource(maximumMemory, maximumCores);
}

public Resource getMinimumAllocation() {
    int minimumMemory = getInt(
        YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB,
        YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB);
    int minimumCores = getInt(
        YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES,
        YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_VCORES);
    return Resources.createResource(minimumMemory, minimumCores);
}
    
case | property                                | default in code | default in xml | description
-----|-----------------------------------------|-----------------|----------------|------------
max  | xx.scheduler.maximum-allocation-mb      | 8g              | 8g             | max ram per container. The maximum allocation for every container request at the RM, in MBs. Memory requests higher than this won't take effect, and will get capped to this value.
     | xx.scheduler.maximum-allocation-vcores  | 4 cores         | 32 cores       | max vcores per container. The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.
min  | xx.scheduler.minimum-allocation-mb      | 1g              | 1g             |
     | xx.scheduler.minimum-allocation-vcores  | 1 core          | 1 core         |
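These properties can be overridden in yarn-site.xml; a sketch (the values below are illustrative, not recommendations):

```xml
<!-- Illustrative overrides; tune to your hardware. -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>4</value>
</property>
```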
  of course, some questions remain:
  1. if a node is configured with 4 GB, then max-allocation-mb should be at most 4 GB; but what if my task needs 5 GB to run? I think that node will never run the task, so a capping fix is necessary, e.g.:

// A resource ask cannot exceed the max.
if (amMemory > maxMem) {
    LOG.info("AM memory specified above max threshold of cluster. Using max value."
        + ", specified=" + amMemory
        + ", max=" + maxMem);
    amMemory = maxMem;
}
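The capping behaviour above can be illustrated outside Java as well; a minimal shell sketch (the numbers are made up):

```shell
# Minimal illustration of capping a resource ask at the cluster max
# (values are made up; YARN does this in the client as shown above).
requested=5120   # MB the task asks for
max=4096         # cluster maximum-allocation-mb
if [ "$requested" -gt "$max" ]; then
  echo "ask $requested MB above max, capping to $max MB"
  requested=$max
fi
echo "granted: $requested MB"
```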
  D. the container id does not strictly follow the app attempt id, but the app id
  container id

container_1433385109839_0001_01_000003
  app attempt id

appattempt_1433385109839_0001_000001
  app id

application_1433385109839_0001
  since one app may contain multiple attempts, the container must bind to the app id instead of the attempt id for its umbilical relationship.
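The app id can be recovered from a container id by simple field splitting, as a sketch (this assumes the standard container_&lt;clusterTimestamp&gt;_&lt;appSeq&gt;_&lt;attempt&gt;_&lt;containerNum&gt; form shown above):

```shell
# Derive the application id from a container id by splitting on '_'
# (assumes the standard container id layout shown above).
container_id="container_1433385109839_0001_01_000003"
app_id=$(echo "$container_id" | awk -F_ '{print "application_" $2 "_" $3}')
echo "$app_id"   # → application_1433385109839_0001
```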
  ref:
  http://dongxicheng.org/mapreduce-nextgen/how-to-run-distributedshell/
