殇帝刘玢你 posted on 2018-10-28 14:00:33

Hadoop MapReduce in practice: compressed files on HDFS (-cacheArchive)

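The post never shows the contents of `run_streaming.sh`, but the log it produces reveals the shape of the command: the deprecation warnings show it used `-file`, `-cacheArchive`, and `-jobconf`, and the `rmr: DEPRECATED` line shows it first deleted the old output directory. The sketch below is a hypothetical reconstruction under those assumptions; the input path, archive name, and `#WH` symlink are placeholders, not the author's actual values.

```shell
#!/bin/bash
# Hypothetical reconstruction of run_streaming.sh. Only the output
# directory and the options -file/-cacheArchive/-jobconf come from the
# log; every path and name below is an assumed placeholder.

HADOOP_CMD="hadoop"
# Assumed streaming-jar location; adjust to your distribution.
STREAM_JAR=$(ls "$HADOOP_HOME"/share/hadoop/tools/lib/hadoop-streaming-*.jar)

INPUT="/input/data"                                        # assumed
OUTPUT="/output/wordcount/WordwhiteCacheArchiveFiletest"   # from the log

# Remove any previous output; the log shows the deprecated `rmr` form.
$HADOOP_CMD fs -rmr "$OUTPUT"

# -cacheArchive ships a compressed archive from HDFS to each task and
# unpacks it next to the task's working directory under the given symlink.
$HADOOP_CMD jar "$STREAM_JAR" \
    -input "$INPUT" \
    -output "$OUTPUT" \
    -mapper "python mapper.py" \
    -reducer "python reducer.py" \
    -file ./mapper.py \
    -file ./reducer.py \
    -cacheArchive "hdfs:///input/wordwhite.tar.gz#WH" \
    -jobconf mapred.job.name="wordcount_cacheArchive"
```

On current Hadoop versions the equivalent generic options are `-files`, `-archives`, and `-D`, as the warnings in the log point out.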
$ chmod +x run_streaming.sh
$ ./run_streaming.sh
  rmr: DEPRECATED: Please use 'rm -r' instead.
  Deleted /output/wordcount/WordwhiteCacheArchiveFiletest
  18/02/01 17:57:00 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
  18/02/01 17:57:00 WARN streaming.StreamJob: -cacheArchive option is deprecated, please use -archives instead.
  18/02/01 17:57:00 WARN streaming.StreamJob: -jobconf option is deprecated, please use -D instead.
  18/02/01 17:57:00 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
  packageJobJar: [./mapper.py, ./reducer.py, /tmp/hadoop-unjar211766205758273068/] [] /tmp/streamjob9043244899616176268.jar tmpDir=null
  18/02/01 17:57:01 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
  18/02/01 17:57:01 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
  18/02/01 17:57:03 INFO mapred.FileInputFormat: Total input paths to process : 1
  18/02/01 17:57:03 INFO mapreduce.JobSubmitter: number of splits:2
  18/02/01 17:57:04 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1516345010544_0030
  18/02/01 17:57:04 INFO impl.YarnClientImpl: Submitted application application_1516345010544_0030
  18/02/01 17:57:04 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1516345010544_0030/
  18/02/01 17:57:04 INFO mapreduce.Job: Running job: job_1516345010544_0030
  18/02/01 17:57:11 INFO mapreduce.Job: Job job_1516345010544_0030 running in uber mode : false
  18/02/01 17:57:11 INFO mapreduce.Job:  map 0% reduce 0%
  18/02/01 17:57:20 INFO mapreduce.Job:  map 50% reduce 0%
  18/02/01 17:57:21 INFO mapreduce.Job:  map 100% reduce 0%
  18/02/01 17:57:27 INFO mapreduce.Job:  map 100% reduce 100%
  18/02/01 17:57:28 INFO mapreduce.Job: Job job_1516345010544_0030 completed successfully
  18/02/01 17:57:28 INFO mapreduce.Job: Counters: 49
  File System Counters
  FILE: Number of bytes read=113911
  FILE: Number of bytes written=664972
  FILE: Number of read operations=0
  FILE: Number of large read operations=0
  FILE: Number of write operations=0
  HDFS: Number of bytes read=636501
  HDFS: Number of bytes written=68
  HDFS: Number of read operations=9
  HDFS: Number of large read operations=0
  HDFS: Number of write operations=2
  Job Counters
  Launched map tasks=2
  Launched reduce tasks=1
  Data-local map tasks=2
  Total time spent by all maps in occupied slots (ms)=12584
  Total time spent by all reduces in occupied slots (ms)=4425
  Total time spent by all map tasks (ms)=12584
  Total time spent by all reduce tasks (ms)=4425
  Total vcore-milliseconds taken by all map tasks=12584
  Total vcore-milliseconds taken by all reduce tasks=4425
  Total megabyte-milliseconds taken by all map tasks=12886016
  Total megabyte-milliseconds taken by all reduce tasks=4531200
  Map-Reduce Framework
  Map input records=2866
  Map output records=14734
  Map output bytes=84437
  Map output materialized bytes=113917
  Input split bytes=198
  Combine input records=0
  Combine output records=0
  Reduce input groups=8
  Reduce shuffle bytes=113917
  Reduce input records=14734
  Reduce output records=8
  Spilled Records=29468
  Shuffled Maps =2
  Failed Shuffles=0
  Merged Map outputs=2
  GC time elapsed (ms)=390
  CPU time spent (ms)=3660
  Physical memory (bytes) snapshot=713809920
  Virtual memory (bytes) snapshot=8331399168
  Total committed heap usage (bytes)=594018304
  Shuffle Errors
  BAD_ID=0
  CONNECTION=0
  IO_ERROR=0
  WRONG_LENGTH=0
  WRONG_MAP=0
  WRONG_REDUCE=0
  File Input Format Counters
  Bytes Read=636303
  File Output Format Counters
  Bytes Written=68
  18/02/01 17:57:28 INFO streaming.StreamJob: Output directory: /output/wordcount/WordwhiteCacheArchiveFiletest
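The post also omits `mapper.py` and `reducer.py`. A minimal streaming word-count pair of the usual shape would read lines from stdin and emit tab-separated key/value pairs, with the reducer relying on the shuffle phase having sorted its input by key. This is a generic sketch, not the author's code, and the whitelist lookup implied by the archive name is left out:

```python
import sys
from itertools import groupby

# Hypothetical stand-ins for the post's mapper.py / reducer.py
# (the originals are not shown in the post).

def mapper(lines):
    """Emit one tab-separated (word, 1) pair per word of input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(pairs):
    """Sum counts per word. Input must already be sorted by key,
    which Hadoop's shuffle phase guarantees between map and reduce."""
    split = (p.split("\t") for p in pairs)
    for word, group in groupby(split, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

if __name__ == "__main__" and not sys.stdin.isatty():
    # Streaming runs the pipeline as: mapper | sort | reducer.
    # Only consume stdin when data is actually piped in.
    stage = sys.argv[1] if len(sys.argv) > 1 else "map"
    step = mapper if stage == "map" else reducer
    for out in step(line.rstrip("\n") for line in sys.stdin):
        print(out)
```

With `-cacheArchive`, the mapper could additionally open files under the archive's symlink (e.g. `WH/wordwhite`) to filter which words it emits; the eight reduce output records in the counters above suggest the author filtered against a small whitelist.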
