殇帝刘玢你 posted on 2018-10-28 14:00:33

Hadoop MapReduce in practice: compressed files on HDFS (-cacheArchive)

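The post never shows the contents of `run_streaming.sh`, but the log it produces reveals the shape of the command: the deprecation warnings show it used `-file`, `-cacheArchive`, and `-jobconf`, and the `rmr: DEPRECATED` line shows it first deleted the old output directory. The sketch below is a hypothetical reconstruction under those assumptions; the input path, archive name, and `#WH` symlink are placeholders, not the author's actual values.

```shell
#!/bin/bash
# Hypothetical reconstruction of run_streaming.sh. Only the output
# directory and the options -file/-cacheArchive/-jobconf come from the
# log; every path and name below is an assumed placeholder.

HADOOP_CMD="hadoop"
# Assumed streaming-jar location; adjust to your distribution.
STREAM_JAR=$(ls "$HADOOP_HOME"/share/hadoop/tools/lib/hadoop-streaming-*.jar)

INPUT="/input/data"                                        # assumed
OUTPUT="/output/wordcount/WordwhiteCacheArchiveFiletest"   # from the log

# Remove any previous output; the log shows the deprecated `rmr` form.
$HADOOP_CMD fs -rmr "$OUTPUT"

# -cacheArchive ships a compressed archive from HDFS to each task and
# unpacks it next to the task's working directory under the given symlink.
$HADOOP_CMD jar "$STREAM_JAR" \
    -input "$INPUT" \
    -output "$OUTPUT" \
    -mapper "python mapper.py" \
    -reducer "python reducer.py" \
    -file ./mapper.py \
    -file ./reducer.py \
    -cacheArchive "hdfs:///input/wordwhite.tar.gz#WH" \
    -jobconf mapred.job.name="wordcount_cacheArchive"
```

On current Hadoop versions the equivalent generic options are `-files`, `-archives`, and `-D`, as the warnings in the log point out.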
$ chmod +x run_streaming.sh
$ ./run_streaming.sh
  rmr: DEPRECATED: Please use 'rm -r' instead.
  Deleted /output/wordcount/WordwhiteCacheArchiveFiletest
  18/02/01 17:57:00 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
  18/02/01 17:57:00 WARN streaming.StreamJob: -cacheArchive option is deprecated, please use -archives instead.
  18/02/01 17:57:00 WARN streaming.StreamJob: -jobconf option is deprecated, please use -D instead.
  18/02/01 17:57:00 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
  packageJobJar: [./mapper.py, ./reducer.py, /tmp/hadoop-unjar211766205758273068/] [] /tmp/streamjob9043244899616176268.jar tmpDir=null
  18/02/01 17:57:01 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
  18/02/01 17:57:01 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
  18/02/01 17:57:03 INFO mapred.FileInputFormat: Total input paths to process : 1
  18/02/01 17:57:03 INFO mapreduce.JobSubmitter: number of splits:2
  18/02/01 17:57:04 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1516345010544_0030
  18/02/01 17:57:04 INFO impl.YarnClientImpl: Submitted application application_1516345010544_0030
  18/02/01 17:57:04 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1516345010544_0030/
  18/02/01 17:57:04 INFO mapreduce.Job: Running job: job_1516345010544_0030
  18/02/01 17:57:11 INFO mapreduce.Job: Job job_1516345010544_0030 running in uber mode : false
  18/02/01 17:57:11 INFO mapreduce.Job:  map 0% reduce 0%
  18/02/01 17:57:20 INFO mapreduce.Job:  map 50% reduce 0%
  18/02/01 17:57:21 INFO mapreduce.Job:  map 100% reduce 0%
  18/02/01 17:57:27 INFO mapreduce.Job:  map 100% reduce 100%
  18/02/01 17:57:28 INFO mapreduce.Job: Job job_1516345010544_0030 completed successfully
  18/02/01 17:57:28 INFO mapreduce.Job: Counters: 49
  File System Counters
  FILE: Number of bytes read=113911
  FILE: Number of bytes written=664972
  FILE: Number of read operations=0
  FILE: Number of large read operations=0
  FILE: Number of write operations=0
  HDFS: Number of bytes read=636501
  HDFS: Number of bytes written=68
  HDFS: Number of read operations=9
  HDFS: Number of large read operations=0
  HDFS: Number of write operations=2
  Job Counters
  Launched map tasks=2
  Launched reduce tasks=1
  Data-local map tasks=2
  Total time spent by all maps in occupied slots (ms)=12584
  Total time spent by all reduces in occupied slots (ms)=4425
  Total time spent by all map tasks (ms)=12584
  Total time spent by all reduce tasks (ms)=4425
  Total vcore-milliseconds taken by all map tasks=12584
  Total vcore-milliseconds taken by all reduce tasks=4425
  Total megabyte-milliseconds taken by all map tasks=12886016
  Total megabyte-milliseconds taken by all reduce tasks=4531200
  Map-Reduce Framework
  Map input records=2866
  Map output records=14734
  Map output bytes=84437
  Map output materialized bytes=113917
  Input split bytes=198
  Combine input records=0
  Combine output records=0
  Reduce input groups=8
  Reduce shuffle bytes=113917
  Reduce input records=14734
  Reduce output records=8
  Spilled Records=29468
  Shuffled Maps =2
  Failed Shuffles=0
  Merged Map outputs=2
  GC time elapsed (ms)=390
  CPU time spent (ms)=3660
  Physical memory (bytes) snapshot=713809920
  Virtual memory (bytes) snapshot=8331399168
  Total committed heap usage (bytes)=594018304
  Shuffle Errors
  BAD_ID=0
  CONNECTION=0
  IO_ERROR=0
  WRONG_LENGTH=0
  WRONG_MAP=0
  WRONG_REDUCE=0
  File Input Format Counters
  Bytes Read=636303
  File Output Format Counters
  Bytes Written=68
  18/02/01 17:57:28 INFO streaming.StreamJob: Output directory: /output/wordcount/WordwhiteCacheArchiveFiletest
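The post also omits `mapper.py` and `reducer.py`. A minimal streaming word-count pair of the usual shape would read lines from stdin and emit tab-separated key/value pairs, with the reducer relying on the shuffle phase having sorted its input by key. This is a generic sketch, not the author's code, and the whitelist lookup implied by the archive name is left out:

```python
import sys
from itertools import groupby

# Hypothetical stand-ins for the post's mapper.py / reducer.py
# (the originals are not shown in the post).

def mapper(lines):
    """Emit one tab-separated (word, 1) pair per word of input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(pairs):
    """Sum counts per word. Input must already be sorted by key,
    which Hadoop's shuffle phase guarantees between map and reduce."""
    split = (p.split("\t") for p in pairs)
    for word, group in groupby(split, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

if __name__ == "__main__" and not sys.stdin.isatty():
    # Streaming runs the pipeline as: mapper | sort | reducer.
    # Only consume stdin when data is actually piped in.
    stage = sys.argv[1] if len(sys.argv) > 1 else "map"
    step = mapper if stage == "map" else reducer
    for out in step(line.rstrip("\n") for line in sys.stdin):
        print(out)
```

With `-cacheArchive`, the mapper could additionally open files under the archive's symlink (e.g. `WH/wordwhite`) to filter which words it emits; the eight reduce output records in the counters above suggest the author filtered against a small whitelist.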
