Posted by wzh789 on 2017-12-18 10:15:12

Hadoop MapReduce Programming API Primer Series: Running Multiple Jobs as Iterative MapReduce (Part 12)

  Recommended

MapReduce analysis of celebrity Weibo data
  http://git.oschina.net/ljc520313/codeexample/tree/master/bigdata/hadoop/mapreduce/05.%E6%98%8E%E6%98%9F%E5%BE%AE%E5%8D%9A%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90?dir=1&filepath=bigdata%2Fhadoop%2Fmapreduce%2F05.%E6%98%8E%E6%98%9F%E5%BE%AE%E5%8D%9A%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90&oid=854b4300ccc9fbae894f2f8c29df3ca06193f97b&sha=79a86bf0ff190e38a133bc2446b6b4ad9490f40f
  That post works through the same example in a different coding style and is worth comparing with this one.






  Execution (job 1, "weibo1": TF per weibo plus the total weibo count)
  2016-12-12 15:07:51,762 INFO - Initializing JVM Metrics with processName=JobTracker, sessionId=
  2016-12-12 15:07:52,197 WARN - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
  2016-12-12 15:07:52,199 WARN - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
  2016-12-12 15:07:52,216 INFO - Total input paths to process : 1
  2016-12-12 15:07:52,265 INFO - number of splits:1
  2016-12-12 15:07:52,541 INFO - Submitting tokens for job: job_local1414008937_0001
  2016-12-12 15:07:53,106 INFO - The url to track the job: http://localhost:8080/
  2016-12-12 15:07:53,107 INFO - Running job: job_local1414008937_0001
  2016-12-12 15:07:53,114 INFO - OutputCommitter set in config null
  2016-12-12 15:07:53,128 INFO - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
  2016-12-12 15:07:53,203 INFO - Waiting for map tasks
  2016-12-12 15:07:53,216 INFO - Starting task: attempt_local1414008937_0001_m_000000_0
  2016-12-12 15:07:53,271 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:07:53,374 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@65f3724c
  2016-12-12 15:07:53,382 INFO - Processing split: file:/D:/Code/MyEclipseJavaCode/myMapReduce/data/Weibodata.txt:0+174116
  2016-12-12 15:07:53,443 INFO - (EQUATOR) 0 kvi 26214396(104857584)
  2016-12-12 15:07:53,443 INFO - mapreduce.task.io.sort.mb: 100
  2016-12-12 15:07:53,443 INFO - soft limit at 83886080
  2016-12-12 15:07:53,444 INFO - bufstart = 0; bufvoid = 104857600
  2016-12-12 15:07:53,444 INFO - kvstart = 26214396; length = 6553600
  2016-12-12 15:07:53,450 INFO - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
  2016-12-12 15:07:54,110 INFO - Job job_local1414008937_0001 running in uber mode : false
  2016-12-12 15:07:54,112 INFO - map 0% reduce 0%
  2016-12-12 15:07:55,068 INFO -
  2016-12-12 15:07:55,068 INFO - Starting flush of map output
  2016-12-12 15:07:55,068 INFO - Spilling map output
  2016-12-12 15:07:55,068 INFO - bufstart = 0; bufend = 747379; bufvoid = 104857600
  2016-12-12 15:07:55,068 INFO - kvstart = 26214396(104857584); kvend = 26101152(104404608); length = 113245/6553600
  count___________1065
  2016-12-12 15:07:55,674 INFO - Finished spill 0
  2016-12-12 15:07:55,685 INFO - Task:attempt_local1414008937_0001_m_000000_0 is done. And is in the process of committing
  2016-12-12 15:07:55,706 INFO - map
  2016-12-12 15:07:55,706 INFO - Task 'attempt_local1414008937_0001_m_000000_0' done.
  2016-12-12 15:07:55,706 INFO - Finishing task: attempt_local1414008937_0001_m_000000_0
  2016-12-12 15:07:55,707 INFO - map task executor complete.
  2016-12-12 15:07:55,714 INFO - Waiting for reduce tasks
  2016-12-12 15:07:55,714 INFO - Starting task: attempt_local1414008937_0001_r_000000_0
  2016-12-12 15:07:55,727 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:07:55,754 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@24a11405
  2016-12-12 15:07:55,758 INFO - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@12efdb85
  2016-12-12 15:07:55,776 INFO - MergerManager: memoryLimit=1327077760, maxSingleShuffleLimit=331769440, mergeThreshold=875871360, ioSortFactor=10, memToMemMergeOutputsThreshold=10
  2016-12-12 15:07:55,778 INFO - attempt_local1414008937_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
  2016-12-12 15:07:55,810 INFO - localfetcher#1 about to shuffle output of map attempt_local1414008937_0001_m_000000_0 decomp: 222260 len: 222264 to MEMORY
  2016-12-12 15:07:55,818 INFO - Read 222260 bytes from map-output for attempt_local1414008937_0001_m_000000_0
  2016-12-12 15:07:55,863 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:07:55,865 INFO - EventFetcher is interrupted.. Returning
  2016-12-12 15:07:55,866 INFO - 1 / 1 copied.
  2016-12-12 15:07:55,867 INFO - finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
  2016-12-12 15:07:55,876 INFO - Merging 1 sorted segments
  2016-12-12 15:07:55,876 INFO - Down to the last merge-pass, with 1 segments left of total
  2016-12-12 15:07:55,952 INFO - Merged 1 segments, 222260 bytes to disk to satisfy reduce memory limit
  2016-12-12 15:07:55,953 INFO - Merging 1 files, 222264 bytes from disk
  2016-12-12 15:07:55,954 INFO - Merging 0 segments, 0 bytes from memory into reduce
  2016-12-12 15:07:55,954 INFO - Merging 1 sorted segments
  2016-12-12 15:07:55,987 INFO - Down to the last merge-pass, with 1 segments left of total
  2016-12-12 15:07:55,989 INFO - 1 / 1 copied.
  2016-12-12 15:07:55,994 INFO - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
  2016-12-12 15:07:56,124 INFO - map 100% reduce 0%
  2016-12-12 15:07:56,347 INFO - Task:attempt_local1414008937_0001_r_000000_0 is done. And is in the process of committing
  2016-12-12 15:07:56,349 INFO - 1 / 1 copied.
  2016-12-12 15:07:56,349 INFO - Task attempt_local1414008937_0001_r_000000_0 is allowed to commit now
  2016-12-12 15:07:56,357 INFO - Saved output of task 'attempt_local1414008937_0001_r_000000_0' to file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/_temporary/0/task_local1414008937_0001_r_000000
  2016-12-12 15:07:56,358 INFO - reduce > reduce
  2016-12-12 15:07:56,359 INFO - Task 'attempt_local1414008937_0001_r_000000_0' done.
  2016-12-12 15:07:56,359 INFO - Finishing task: attempt_local1414008937_0001_r_000000_0
  2016-12-12 15:07:56,359 INFO - Starting task: attempt_local1414008937_0001_r_000001_0
  2016-12-12 15:07:56,365 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:07:56,391 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@464d02ee
  2016-12-12 15:07:56,392 INFO - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@69fb7b50
  2016-12-12 15:07:56,394 INFO - MergerManager: memoryLimit=1327077760, maxSingleShuffleLimit=331769440, mergeThreshold=875871360, ioSortFactor=10, memToMemMergeOutputsThreshold=10
  2016-12-12 15:07:56,395 INFO - attempt_local1414008937_0001_r_000001_0 Thread started: EventFetcher for fetching Map Completion Events
  2016-12-12 15:07:56,399 INFO - localfetcher#2 about to shuffle output of map attempt_local1414008937_0001_m_000000_0 decomp: 226847 len: 226851 to MEMORY
  2016-12-12 15:07:56,401 INFO - Read 226847 bytes from map-output for attempt_local1414008937_0001_m_000000_0
  2016-12-12 15:07:56,401 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:07:56,402 INFO - EventFetcher is interrupted.. Returning
  2016-12-12 15:07:56,402 INFO - 1 / 1 copied.
  2016-12-12 15:07:56,402 INFO - finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
  2016-12-12 15:07:56,407 INFO - Merging 1 sorted segments
  2016-12-12 15:07:56,407 INFO - Down to the last merge-pass, with 1 segments left of total
  2016-12-12 15:07:56,488 INFO - Merged 1 segments, 226847 bytes to disk to satisfy reduce memory limit
  2016-12-12 15:07:56,488 INFO - Merging 1 files, 226851 bytes from disk
  2016-12-12 15:07:56,489 INFO - Merging 0 segments, 0 bytes from memory into reduce
  2016-12-12 15:07:56,489 INFO - Merging 1 sorted segments
  2016-12-12 15:07:56,490 INFO - Down to the last merge-pass, with 1 segments left of total
  2016-12-12 15:07:56,491 INFO - 1 / 1 copied.
  2016-12-12 15:07:56,581 INFO - Task:attempt_local1414008937_0001_r_000001_0 is done. And is in the process of committing
  2016-12-12 15:07:56,584 INFO - 1 / 1 copied.
  2016-12-12 15:07:56,584 INFO - Task attempt_local1414008937_0001_r_000001_0 is allowed to commit now
  2016-12-12 15:07:56,591 INFO - Saved output of task 'attempt_local1414008937_0001_r_000001_0' to file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/_temporary/0/task_local1414008937_0001_r_000001
  2016-12-12 15:07:56,593 INFO - reduce > reduce
  2016-12-12 15:07:56,593 INFO - Task 'attempt_local1414008937_0001_r_000001_0' done.
  2016-12-12 15:07:56,593 INFO - Finishing task: attempt_local1414008937_0001_r_000001_0
  2016-12-12 15:07:56,593 INFO - Starting task: attempt_local1414008937_0001_r_000002_0
  2016-12-12 15:07:56,596 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:07:56,640 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@36d0c62b
  2016-12-12 15:07:56,640 INFO - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@44824d2a
  2016-12-12 15:07:56,641 INFO - MergerManager: memoryLimit=1327077760, maxSingleShuffleLimit=331769440, mergeThreshold=875871360, ioSortFactor=10, memToMemMergeOutputsThreshold=10
  2016-12-12 15:07:56,643 INFO - attempt_local1414008937_0001_r_000002_0 Thread started: EventFetcher for fetching Map Completion Events
  2016-12-12 15:07:56,648 INFO - localfetcher#3 about to shuffle output of map attempt_local1414008937_0001_m_000000_0 decomp: 224215 len: 224219 to MEMORY
  2016-12-12 15:07:56,650 INFO - Read 224215 bytes from map-output for attempt_local1414008937_0001_m_000000_0
  2016-12-12 15:07:56,650 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:07:56,651 INFO - EventFetcher is interrupted.. Returning
  2016-12-12 15:07:56,651 INFO - 1 / 1 copied.
  2016-12-12 15:07:56,652 INFO - finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
  2016-12-12 15:07:56,658 INFO - Merging 1 sorted segments
  2016-12-12 15:07:56,658 INFO - Down to the last merge-pass, with 1 segments left of total
  2016-12-12 15:07:56,675 INFO - Merged 1 segments, 224215 bytes to disk to satisfy reduce memory limit
  2016-12-12 15:07:56,676 INFO - Merging 1 files, 224219 bytes from disk
  2016-12-12 15:07:56,676 INFO - Merging 0 segments, 0 bytes from memory into reduce
  2016-12-12 15:07:56,676 INFO - Merging 1 sorted segments
  2016-12-12 15:07:56,677 INFO - Down to the last merge-pass, with 1 segments left of total
  2016-12-12 15:07:56,678 INFO - 1 / 1 copied.
  2016-12-12 15:07:56,711 INFO - Task:attempt_local1414008937_0001_r_000002_0 is done. And is in the process of committing
  2016-12-12 15:07:56,714 INFO - 1 / 1 copied.
  2016-12-12 15:07:56,714 INFO - Task attempt_local1414008937_0001_r_000002_0 is allowed to commit now
  2016-12-12 15:07:56,725 INFO - Saved output of task 'attempt_local1414008937_0001_r_000002_0' to file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/_temporary/0/task_local1414008937_0001_r_000002
  2016-12-12 15:07:56,726 INFO - reduce > reduce
  2016-12-12 15:07:56,727 INFO - Task 'attempt_local1414008937_0001_r_000002_0' done.
  2016-12-12 15:07:56,727 INFO - Finishing task: attempt_local1414008937_0001_r_000002_0
  2016-12-12 15:07:56,727 INFO - Starting task: attempt_local1414008937_0001_r_000003_0
  2016-12-12 15:07:56,729 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:07:56,749 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@42ed705f
  2016-12-12 15:07:56,750 INFO - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@726c8f4c
  2016-12-12 15:07:56,751 INFO - MergerManager: memoryLimit=1327077760, maxSingleShuffleLimit=331769440, mergeThreshold=875871360, ioSortFactor=10, memToMemMergeOutputsThreshold=10
  2016-12-12 15:07:56,752 INFO - attempt_local1414008937_0001_r_000003_0 Thread started: EventFetcher for fetching Map Completion Events
  2016-12-12 15:07:56,757 INFO - localfetcher#4 about to shuffle output of map attempt_local1414008937_0001_m_000000_0 decomp: 14 len: 18 to MEMORY
  2016-12-12 15:07:56,758 INFO - Read 14 bytes from map-output for attempt_local1414008937_0001_m_000000_0
  2016-12-12 15:07:56,758 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:07:56,759 INFO - EventFetcher is interrupted.. Returning
  2016-12-12 15:07:56,759 INFO - 1 / 1 copied.
  2016-12-12 15:07:56,759 INFO - finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
  2016-12-12 15:07:56,764 INFO - Merging 1 sorted segments
  2016-12-12 15:07:56,764 INFO - Down to the last merge-pass, with 1 segments left of total
  2016-12-12 15:07:56,765 INFO - Merged 1 segments, 14 bytes to disk to satisfy reduce memory limit
  2016-12-12 15:07:56,765 INFO - Merging 1 files, 18 bytes from disk
  2016-12-12 15:07:56,765 INFO - Merging 0 segments, 0 bytes from memory into reduce
  2016-12-12 15:07:56,765 INFO - Merging 1 sorted segments
  2016-12-12 15:07:56,766 INFO - Down to the last merge-pass, with 1 segments left of total
  2016-12-12 15:07:56,766 INFO - 1 / 1 copied.
  count___________1065
  2016-12-12 15:07:56,770 INFO - Task:attempt_local1414008937_0001_r_000003_0 is done. And is in the process of committing
  2016-12-12 15:07:56,771 INFO - 1 / 1 copied.
  2016-12-12 15:07:56,771 INFO - Task attempt_local1414008937_0001_r_000003_0 is allowed to commit now
  2016-12-12 15:07:56,777 INFO - Saved output of task 'attempt_local1414008937_0001_r_000003_0' to file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/_temporary/0/task_local1414008937_0001_r_000003
  2016-12-12 15:07:56,778 INFO - reduce > reduce
  2016-12-12 15:07:56,778 INFO - Task 'attempt_local1414008937_0001_r_000003_0' done.
  2016-12-12 15:07:56,778 INFO - Finishing task: attempt_local1414008937_0001_r_000003_0
  2016-12-12 15:07:56,779 INFO - reduce task executor complete.
  2016-12-12 15:07:57,127 INFO - map 100% reduce 100%
  2016-12-12 15:07:57,137 INFO - Job job_local1414008937_0001 completed successfully
  2016-12-12 15:07:57,186 INFO - Counters: 33
  File System Counters
  FILE: Number of bytes read=4937350
  FILE: Number of bytes written=8113860
  FILE: Number of read operations=0
  FILE: Number of large read operations=0
  FILE: Number of write operations=0
  Map-Reduce Framework
  Map input records=1065
  Map output records=28312
  Map output bytes=747379
  Map output materialized bytes=673352
  Input split bytes=127
  Combine input records=28312
  Combine output records=23098
  Reduce input groups=23098
  Reduce shuffle bytes=673352
  Reduce input records=23098
  Reduce output records=23098
  Spilled Records=46196
  Shuffled Maps =4
  Failed Shuffles=0
  Merged Map outputs=4
  GC time elapsed (ms)=165
  CPU time spent (ms)=0
  Physical memory (bytes) snapshot=0
  Virtual memory (bytes) snapshot=0
  Total committed heap usage (bytes)=1672478720
  Shuffle Errors
  BAD_ID=0
  CONNECTION=0
  IO_ERROR=0
  WRONG_LENGTH=0
  WRONG_MAP=0
  WRONG_REDUCE=0
  File Input Format Counters
  Bytes Read=174116
  File Output Format Counters
  Bytes Written=585532








  Execution (job 2, "weibo2": DF, the number of weibos each word appears in)
  2016-12-12 15:10:36,011 INFO - Initializing JVM Metrics with processName=JobTracker, sessionId=
  2016-12-12 15:10:36,436 WARN - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
  2016-12-12 15:10:36,438 WARN - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
  2016-12-12 15:10:36,892 INFO - Total input paths to process : 4
  2016-12-12 15:10:36,959 INFO - number of splits:4
  2016-12-12 15:10:37,215 INFO - Submitting tokens for job: job_local564512176_0001
  2016-12-12 15:10:37,668 INFO - The url to track the job: http://localhost:8080/
  2016-12-12 15:10:37,670 INFO - Running job: job_local564512176_0001
  2016-12-12 15:10:37,672 INFO - OutputCommitter set in config null
  2016-12-12 15:10:37,685 INFO - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
  2016-12-12 15:10:37,757 INFO - Waiting for map tasks
  2016-12-12 15:10:37,759 INFO - Starting task: attempt_local564512176_0001_m_000000_0
  2016-12-12 15:10:37,822 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:10:37,854 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@12633e10
  2016-12-12 15:10:37,861 INFO - Processing split: file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/part-r-00001:0+195718
  2016-12-12 15:10:37,924 INFO - (EQUATOR) 0 kvi 26214396(104857584)
  2016-12-12 15:10:37,924 INFO - mapreduce.task.io.sort.mb: 100
  2016-12-12 15:10:37,925 INFO - soft limit at 83886080
  2016-12-12 15:10:37,925 INFO - bufstart = 0; bufvoid = 104857600
  2016-12-12 15:10:37,925 INFO - kvstart = 26214396; length = 6553600
  2016-12-12 15:10:37,932 INFO - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
  2016-12-12 15:10:38,401 INFO -
  2016-12-12 15:10:38,402 INFO - Starting flush of map output
  2016-12-12 15:10:38,402 INFO - Spilling map output
  2016-12-12 15:10:38,402 INFO - bufstart = 0; bufend = 78968; bufvoid = 104857600
  2016-12-12 15:10:38,402 INFO - kvstart = 26214396(104857584); kvend = 26183268(104733072); length = 31129/6553600
  2016-12-12 15:10:38,673 INFO - Job job_local564512176_0001 running in uber mode : false
  2016-12-12 15:10:38,676 INFO - map 0% reduce 0%
  2016-12-12 15:10:38,724 INFO - Finished spill 0
  2016-12-12 15:10:38,730 INFO - Task:attempt_local564512176_0001_m_000000_0 is done. And is in the process of committing
  2016-12-12 15:10:38,744 INFO - map
  2016-12-12 15:10:38,744 INFO - Task 'attempt_local564512176_0001_m_000000_0' done.
  2016-12-12 15:10:38,745 INFO - Finishing task: attempt_local564512176_0001_m_000000_0
  2016-12-12 15:10:38,745 INFO - Starting task: attempt_local564512176_0001_m_000001_0
  2016-12-12 15:10:38,748 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:10:38,778 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@43aa735f
  2016-12-12 15:10:38,784 INFO - Processing split: file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/part-r-00002:0+193443
  2016-12-12 15:10:38,820 INFO - (EQUATOR) 0 kvi 26214396(104857584)
  2016-12-12 15:10:38,820 INFO - mapreduce.task.io.sort.mb: 100
  2016-12-12 15:10:38,820 INFO - soft limit at 83886080
  2016-12-12 15:10:38,821 INFO - bufstart = 0; bufvoid = 104857600
  2016-12-12 15:10:38,821 INFO - kvstart = 26214396; length = 6553600
  2016-12-12 15:10:38,822 INFO - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
  2016-12-12 15:10:39,017 INFO -
  2016-12-12 15:10:39,017 INFO - Starting flush of map output
  2016-12-12 15:10:39,018 INFO - Spilling map output
  2016-12-12 15:10:39,018 INFO - bufstart = 0; bufend = 78027; bufvoid = 104857600
  2016-12-12 15:10:39,018 INFO - kvstart = 26214396(104857584); kvend = 26183624(104734496); length = 30773/6553600
  2016-12-12 15:10:39,157 INFO - Finished spill 0
  2016-12-12 15:10:39,162 INFO - Task:attempt_local564512176_0001_m_000001_0 is done. And is in the process of committing
  2016-12-12 15:10:39,166 INFO - map
  2016-12-12 15:10:39,166 INFO - Task 'attempt_local564512176_0001_m_000001_0' done.
  2016-12-12 15:10:39,166 INFO - Finishing task: attempt_local564512176_0001_m_000001_0
  2016-12-12 15:10:39,167 INFO - Starting task: attempt_local564512176_0001_m_000002_0
  2016-12-12 15:10:39,171 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:10:39,219 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@405f4f03
  2016-12-12 15:10:39,222 INFO - Processing split: file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/part-r-00000:0+191780
  2016-12-12 15:10:39,265 INFO - (EQUATOR) 0 kvi 26214396(104857584)
  2016-12-12 15:10:39,265 INFO - mapreduce.task.io.sort.mb: 100
  2016-12-12 15:10:39,265 INFO - soft limit at 83886080
  2016-12-12 15:10:39,265 INFO - bufstart = 0; bufvoid = 104857600
  2016-12-12 15:10:39,265 INFO - kvstart = 26214396; length = 6553600
  2016-12-12 15:10:39,270 INFO - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
  2016-12-12 15:10:39,311 INFO -
  2016-12-12 15:10:39,311 INFO - Starting flush of map output
  2016-12-12 15:10:39,311 INFO - Spilling map output
  2016-12-12 15:10:39,311 INFO - bufstart = 0; bufend = 77478; bufvoid = 104857600
  2016-12-12 15:10:39,312 INFO - kvstart = 26214396(104857584); kvend = 26183920(104735680); length = 30477/6553600
  2016-12-12 15:10:39,360 INFO - Finished spill 0
  2016-12-12 15:10:39,365 INFO - Task:attempt_local564512176_0001_m_000002_0 is done. And is in the process of committing
  2016-12-12 15:10:39,368 INFO - map
  2016-12-12 15:10:39,369 INFO - Task 'attempt_local564512176_0001_m_000002_0' done.
  2016-12-12 15:10:39,369 INFO - Finishing task: attempt_local564512176_0001_m_000002_0
  2016-12-12 15:10:39,369 INFO - Starting task: attempt_local564512176_0001_m_000003_0
  2016-12-12 15:10:39,372 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:10:39,416 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@4e5497cb
  2016-12-12 15:10:39,419 INFO - Processing split: file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/part-r-00003:0+11
  2016-12-12 15:10:39,461 INFO - (EQUATOR) 0 kvi 26214396(104857584)
  2016-12-12 15:10:39,461 INFO - mapreduce.task.io.sort.mb: 100
  2016-12-12 15:10:39,461 INFO - soft limit at 83886080
  2016-12-12 15:10:39,461 INFO - bufstart = 0; bufvoid = 104857600
  2016-12-12 15:10:39,462 INFO - kvstart = 26214396; length = 6553600
  2016-12-12 15:10:39,463 INFO - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
  2016-12-12 15:10:39,466 INFO -
  2016-12-12 15:10:39,466 INFO - Starting flush of map output
  2016-12-12 15:10:39,479 INFO - Task:attempt_local564512176_0001_m_000003_0 is done. And is in the process of committing
  2016-12-12 15:10:39,482 INFO - map
  2016-12-12 15:10:39,482 INFO - Task 'attempt_local564512176_0001_m_000003_0' done.
  2016-12-12 15:10:39,482 INFO - Finishing task: attempt_local564512176_0001_m_000003_0
  2016-12-12 15:10:39,482 INFO - map task executor complete.
  2016-12-12 15:10:39,487 INFO - Waiting for reduce tasks
  2016-12-12 15:10:39,488 INFO - Starting task: attempt_local564512176_0001_r_000000_0
  2016-12-12 15:10:39,497 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:10:39,519 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@6d565f45
  2016-12-12 15:10:39,523 INFO - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@1f719a8d
  2016-12-12 15:10:39,538 INFO - MergerManager: memoryLimit=1327077760, maxSingleShuffleLimit=331769440, mergeThreshold=875871360, ioSortFactor=10, memToMemMergeOutputsThreshold=10
  2016-12-12 15:10:39,541 INFO - attempt_local564512176_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
  2016-12-12 15:10:39,583 INFO - localfetcher#1 about to shuffle output of map attempt_local564512176_0001_m_000002_0 decomp: 37768 len: 37772 to MEMORY
  2016-12-12 15:10:39,589 INFO - Read 37768 bytes from map-output for attempt_local564512176_0001_m_000002_0
  2016-12-12 15:10:39,638 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:10:39,644 INFO - localfetcher#1 about to shuffle output of map attempt_local564512176_0001_m_000001_0 decomp: 37233 len: 37237 to MEMORY
  2016-12-12 15:10:39,646 INFO - Read 37233 bytes from map-output for attempt_local564512176_0001_m_000001_0
  2016-12-12 15:10:39,647 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:10:39,652 INFO - localfetcher#1 about to shuffle output of map attempt_local564512176_0001_m_000000_0 decomp: 37343 len: 37347 to MEMORY
  2016-12-12 15:10:39,653 INFO - Read 37343 bytes from map-output for attempt_local564512176_0001_m_000000_0
  2016-12-12 15:10:39,654 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:10:39,658 INFO - localfetcher#1 about to shuffle output of map attempt_local564512176_0001_m_000003_0 decomp: 2 len: 6 to MEMORY
  2016-12-12 15:10:39,659 INFO - Read 2 bytes from map-output for attempt_local564512176_0001_m_000003_0
  2016-12-12 15:10:39,660 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:10:39,660 INFO - EventFetcher is interrupted.. Returning
  2016-12-12 15:10:39,661 INFO - 4 / 4 copied.
  2016-12-12 15:10:39,662 INFO - finalMerge called with 4 in-memory map-outputs and 0 on-disk map-outputs
  2016-12-12 15:10:39,673 INFO - Merging 4 sorted segments
  2016-12-12 15:10:39,674 INFO - Down to the last merge-pass, with 3 segments left of total
  2016-12-12 15:10:39,678 INFO - map 100% reduce 0%
  2016-12-12 15:10:39,780 INFO - Merged 4 segments, 112346 bytes to disk to satisfy reduce memory limit
  2016-12-12 15:10:39,781 INFO - Merging 1 files, 112344 bytes from disk
  2016-12-12 15:10:39,783 INFO - Merging 0 segments, 0 bytes from memory into reduce
  2016-12-12 15:10:39,784 INFO - Merging 1 sorted segments
  2016-12-12 15:10:39,785 INFO - Down to the last merge-pass, with 1 segments left of total
  2016-12-12 15:10:39,785 INFO - 4 / 4 copied.
  2016-12-12 15:10:39,792 INFO - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
  2016-12-12 15:10:40,343 INFO - Task:attempt_local564512176_0001_r_000000_0 is done. And is in the process of committing
  2016-12-12 15:10:40,346 INFO - 4 / 4 copied.
  2016-12-12 15:10:40,346 INFO - Task attempt_local564512176_0001_r_000000_0 is allowed to commit now
  2016-12-12 15:10:40,353 INFO - Saved output of task 'attempt_local564512176_0001_r_000000_0' to file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo2/_temporary/0/task_local564512176_0001_r_000000
  2016-12-12 15:10:40,363 INFO - reduce > reduce
  2016-12-12 15:10:40,364 INFO - Task 'attempt_local564512176_0001_r_000000_0' done.
  2016-12-12 15:10:40,364 INFO - Finishing task: attempt_local564512176_0001_r_000000_0
  2016-12-12 15:10:40,364 INFO - reduce task executor complete.
  2016-12-12 15:10:40,678 INFO - map 100% reduce 100%
  2016-12-12 15:10:40,678 INFO - Job job_local564512176_0001 completed successfully
  2016-12-12 15:10:40,701 INFO - Counters: 33
  File System Counters
  FILE: Number of bytes read=2579152
  FILE: Number of bytes written=1581170
  FILE: Number of read operations=0
  FILE: Number of large read operations=0
  FILE: Number of write operations=0
  Map-Reduce Framework
  Map input records=23098
  Map output records=23097
  Map output bytes=234473
  Map output materialized bytes=112362
  Input split bytes=528
  Combine input records=23097
  Combine output records=8774
  Reduce input groups=5567
  Reduce shuffle bytes=112362
  Reduce input records=8774
  Reduce output records=5567
  Spilled Records=17548
  Shuffled Maps =4
  Failed Shuffles=0
  Merged Map outputs=4
  GC time elapsed (ms)=48
  CPU time spent (ms)=0
  Physical memory (bytes) snapshot=0
  Virtual memory (bytes) snapshot=0
  Total committed heap usage (bytes)=2114977792
  Shuffle Errors
  BAD_ID=0
  CONNECTION=0
  IO_ERROR=0
  WRONG_LENGTH=0
  WRONG_MAP=0
  WRONG_REDUCE=0
  File Input Format Counters
  Bytes Read=585564
  File Output Format Counters
  Bytes Written=50762
  Job executed successfully





  Execution (job 3, "weibo3": the final TF-IDF computation)
  2016-12-12 15:12:33,225 INFO - Initializing JVM Metrics with processName=JobTracker, sessionId=
  2016-12-12 15:12:33,823 WARN - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
  2016-12-12 15:12:33,824 WARN - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
  2016-12-12 15:12:34,364 INFO - Total input paths to process : 4
  2016-12-12 15:12:34,410 INFO - number of splits:4
  2016-12-12 15:12:34,729 INFO - Submitting tokens for job: job_local671371338_0001
  2016-12-12 15:12:35,471 INFO - Creating symlink: \tmp\hadoop-Administrator\mapred\local\1481526755080\part-r-00003 <- D:\Code\MyEclipseJavaCode\myMapReduce/part-r-00003
  2016-12-12 15:12:35,516 INFO - Localized file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/part-r-00003 as file:/tmp/hadoop-Administrator/mapred/local/1481526755080/part-r-00003
  2016-12-12 15:12:35,521 INFO - Creating symlink: \tmp\hadoop-Administrator\mapred\local\1481526755081\part-r-00000 <- D:\Code\MyEclipseJavaCode\myMapReduce/part-r-00000
  2016-12-12 15:12:35,544 INFO - Localized file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo2/part-r-00000 as file:/tmp/hadoop-Administrator/mapred/local/1481526755081/part-r-00000
  2016-12-12 15:12:35,696 INFO - The url to track the job: http://localhost:8080/
  2016-12-12 15:12:35,697 INFO - Running job: job_local671371338_0001
  2016-12-12 15:12:35,703 INFO - OutputCommitter set in config null
  2016-12-12 15:12:35,715 INFO - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
  2016-12-12 15:12:35,772 INFO - Waiting for map tasks
  2016-12-12 15:12:35,772 INFO - Starting task: attempt_local671371338_0001_m_000000_0
  2016-12-12 15:12:35,819 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:12:35,852 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@50b97c8b
  2016-12-12 15:12:35,858 INFO - Processing split: file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/part-r-00001:0+195718
  2016-12-12 15:12:35,926 INFO - (EQUATOR) 0 kvi 26214396(104857584)
  2016-12-12 15:12:35,926 INFO - mapreduce.task.io.sort.mb: 100
  2016-12-12 15:12:35,926 INFO - soft limit at 83886080
  2016-12-12 15:12:35,926 INFO - bufstart = 0; bufvoid = 104857600
  2016-12-12 15:12:35,927 INFO - kvstart = 26214396; length = 6553600
  2016-12-12 15:12:35,938 INFO - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
  ******************
  2016-12-12 15:12:36,701 INFO - Job job_local671371338_0001 running in uber mode : false
  2016-12-12 15:12:36,703 INFO - map 0% reduce 0%
  2016-12-12 15:12:36,965 INFO -
  2016-12-12 15:12:36,966 INFO - Starting flush of map output
  2016-12-12 15:12:36,966 INFO - Spilling map output
  2016-12-12 15:12:36,966 INFO - bufstart = 0; bufend = 239755; bufvoid = 104857600
  2016-12-12 15:12:36,966 INFO - kvstart = 26214396(104857584); kvend = 26183268(104733072); length = 31129/6553600
  2016-12-12 15:12:37,135 INFO - Finished spill 0
  2016-12-12 15:12:37,141 INFO - Task:attempt_local671371338_0001_m_000000_0 is done. And is in the process of committing
  2016-12-12 15:12:37,153 INFO - map
  2016-12-12 15:12:37,153 INFO - Task 'attempt_local671371338_0001_m_000000_0' done.
  2016-12-12 15:12:37,154 INFO - Finishing task: attempt_local671371338_0001_m_000000_0
  2016-12-12 15:12:37,154 INFO - Starting task: attempt_local671371338_0001_m_000001_0
  2016-12-12 15:12:37,156 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:12:37,191 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@70849e34
  2016-12-12 15:12:37,194 INFO - Processing split: file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/part-r-00002:0+193443
  2016-12-12 15:12:37,229 INFO - (EQUATOR) 0 kvi 26214396(104857584)
  2016-12-12 15:12:37,229 INFO - mapreduce.task.io.sort.mb: 100
  2016-12-12 15:12:37,229 INFO - soft limit at 83886080
  2016-12-12 15:12:37,230 INFO - bufstart = 0; bufvoid = 104857600
  2016-12-12 15:12:37,230 INFO - kvstart = 26214396; length = 6553600
  2016-12-12 15:12:37,230 INFO - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
  ******************
  2016-12-12 15:12:37,601 INFO -
  2016-12-12 15:12:37,602 INFO - Starting flush of map output
  2016-12-12 15:12:37,602 INFO - Spilling map output
  2016-12-12 15:12:37,602 INFO - bufstart = 0; bufend = 237126; bufvoid = 104857600
  2016-12-12 15:12:37,602 INFO - kvstart = 26214396(104857584); kvend = 26183624(104734496); length = 30773/6553600
  2016-12-12 15:12:37,651 INFO - Finished spill 0
  2016-12-12 15:12:37,683 INFO - Task:attempt_local671371338_0001_m_000001_0 is done. And is in the process of committing
  2016-12-12 15:12:37,687 INFO - map
  2016-12-12 15:12:37,687 INFO - Task 'attempt_local671371338_0001_m_000001_0' done.
  2016-12-12 15:12:37,687 INFO - Finishing task: attempt_local671371338_0001_m_000001_0
  2016-12-12 15:12:37,687 INFO - Starting task: attempt_local671371338_0001_m_000002_0
  2016-12-12 15:12:37,690 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:12:37,722 INFO - map 100% reduce 0%
  2016-12-12 15:12:37,810 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@544b0d4c
  2016-12-12 15:12:37,813 INFO - Processing split: file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/part-r-00000:0+191780
  2016-12-12 15:12:37,851 INFO - (EQUATOR) 0 kvi 26214396(104857584)
  2016-12-12 15:12:37,851 INFO - mapreduce.task.io.sort.mb: 100
  2016-12-12 15:12:37,851 INFO - soft limit at 83886080
  2016-12-12 15:12:37,851 INFO - bufstart = 0; bufvoid = 104857600
  2016-12-12 15:12:37,852 INFO - kvstart = 26214396; length = 6553600
  2016-12-12 15:12:37,853 INFO - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
  ******************
  2016-12-12 15:12:37,915 INFO -
  2016-12-12 15:12:37,915 INFO - Starting flush of map output
  2016-12-12 15:12:37,916 INFO - Spilling map output
  2016-12-12 15:12:37,916 INFO - bufstart = 0; bufend = 234731; bufvoid = 104857600
  2016-12-12 15:12:37,916 INFO - kvstart = 26214396(104857584); kvend = 26183920(104735680); length = 30477/6553600
  2016-12-12 15:12:37,939 INFO - Finished spill 0
  2016-12-12 15:12:37,943 INFO - Task:attempt_local671371338_0001_m_000002_0 is done. And is in the process of committing
  2016-12-12 15:12:37,946 INFO - map
  2016-12-12 15:12:37,946 INFO - Task 'attempt_local671371338_0001_m_000002_0' done.
  2016-12-12 15:12:37,946 INFO - Finishing task: attempt_local671371338_0001_m_000002_0
  2016-12-12 15:12:37,947 INFO - Starting task: attempt_local671371338_0001_m_000003_0
  2016-12-12 15:12:37,950 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:12:37,999 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@6c241f31
  2016-12-12 15:12:38,002 INFO - Processing split: file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo1/part-r-00003:0+11
  2016-12-12 15:12:38,046 INFO - (EQUATOR) 0 kvi 26214396(104857584)
  2016-12-12 15:12:38,046 INFO - mapreduce.task.io.sort.mb: 100
  2016-12-12 15:12:38,046 INFO - soft limit at 83886080
  2016-12-12 15:12:38,046 INFO - bufstart = 0; bufvoid = 104857600
  2016-12-12 15:12:38,046 INFO - kvstart = 26214396; length = 6553600
  2016-12-12 15:12:38,047 INFO - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
  ******************
  2016-12-12 15:12:38,050 INFO -
  2016-12-12 15:12:38,050 INFO - Starting flush of map output
  2016-12-12 15:12:38,060 INFO - Task:attempt_local671371338_0001_m_000003_0 is done. And is in the process of committing
  2016-12-12 15:12:38,063 INFO - map
  2016-12-12 15:12:38,063 INFO - Task 'attempt_local671371338_0001_m_000003_0' done.
  2016-12-12 15:12:38,064 INFO - Finishing task: attempt_local671371338_0001_m_000003_0
  2016-12-12 15:12:38,064 INFO - map task executor complete.
  2016-12-12 15:12:38,067 INFO - Waiting for reduce tasks
  2016-12-12 15:12:38,067 INFO - Starting task: attempt_local671371338_0001_r_000000_0
  2016-12-12 15:12:38,079 INFO - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-12-12 15:12:38,104 INFO - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@777da320
  2016-12-12 15:12:38,116 INFO - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@76a01b4b
  2016-12-12 15:12:38,133 INFO - MergerManager: memoryLimit=1327077760, maxSingleShuffleLimit=331769440, mergeThreshold=875871360, ioSortFactor=10, memToMemMergeOutputsThreshold=10
  2016-12-12 15:12:38,135 INFO - attempt_local671371338_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
  2016-12-12 15:12:38,165 INFO - localfetcher#1 about to shuffle output of map attempt_local671371338_0001_m_000001_0 decomp: 252516 len: 252520 to MEMORY
  2016-12-12 15:12:38,169 INFO - Read 252516 bytes from map-output for attempt_local671371338_0001_m_000001_0
  2016-12-12 15:12:38,216 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:12:38,221 INFO - localfetcher#1 about to shuffle output of map attempt_local671371338_0001_m_000002_0 decomp: 249973 len: 249977 to MEMORY
  2016-12-12 15:12:38,223 INFO - Read 249973 bytes from map-output for attempt_local671371338_0001_m_000002_0
  2016-12-12 15:12:38,224 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:12:38,230 INFO - localfetcher#1 about to shuffle output of map attempt_local671371338_0001_m_000000_0 decomp: 255323 len: 255327 to MEMORY
  2016-12-12 15:12:38,233 INFO - Read 255323 bytes from map-output for attempt_local671371338_0001_m_000000_0
  2016-12-12 15:12:38,233 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:12:38,235 INFO - localfetcher#1 about to shuffle output of map attempt_local671371338_0001_m_000003_0 decomp: 2 len: 6 to MEMORY
  2016-12-12 15:12:38,236 INFO - Read 2 bytes from map-output for attempt_local671371338_0001_m_000003_0
  2016-12-12 15:12:38,236 INFO - closeInMemoryFile -> map-output of
  2016-12-12 15:12:38,237 INFO - EventFetcher is interrupted.. Returning
  2016-12-12 15:12:38,238 INFO - 4 / 4 copied.
  2016-12-12 15:12:38,238 INFO - finalMerge called with 4 in-memory map-outputs and 0 on-disk map-outputs
  2016-12-12 15:12:38,252 INFO - Merging 4 sorted segments
  2016-12-12 15:12:38,253 INFO - Down to the last merge-pass, with 3 segments left of total
  2016-12-12 15:12:38,413 INFO - Merged 4 segments, 757814 bytes to disk to satisfy reduce memory limit
  2016-12-12 15:12:38,414 INFO - Merging 1 files, 757812 bytes from disk
  2016-12-12 15:12:38,415 INFO - Merging 0 segments, 0 bytes from memory into reduce
  2016-12-12 15:12:38,415 INFO - Merging 1 sorted segments
  2016-12-12 15:12:38,416 INFO - Down to the last merge-pass, with 1 segments left of total
  2016-12-12 15:12:38,433 INFO - 4 / 4 copied.
  2016-12-12 15:12:38,439 INFO - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
  2016-12-12 15:12:38,844 INFO - Task:attempt_local671371338_0001_r_000000_0 is done. And is in the process of committing
  2016-12-12 15:12:38,846 INFO - 4 / 4 copied.
  2016-12-12 15:12:38,846 INFO - Task attempt_local671371338_0001_r_000000_0 is allowed to commit now
  2016-12-12 15:12:38,857 INFO - Saved output of task 'attempt_local671371338_0001_r_000000_0' to file:/D:/Code/MyEclipseJavaCode/myMapReduce/out/weibo3/_temporary/0/task_local671371338_0001_r_000000
  2016-12-12 15:12:38,861 INFO - reduce > reduce
  2016-12-12 15:12:38,861 INFO - Task 'attempt_local671371338_0001_r_000000_0' done.
  2016-12-12 15:12:38,861 INFO - Finishing task: attempt_local671371338_0001_r_000000_0
  2016-12-12 15:12:38,862 INFO - reduce task executor complete.
  2016-12-12 15:12:39,724 INFO - map 100% reduce 100%
  2016-12-12 15:12:39,726 INFO - Job job_local671371338_0001 completed successfully
  2016-12-12 15:12:39,841 INFO - Counters: 33
  File System Counters
  FILE: Number of bytes read=4124093
  FILE: Number of bytes written=5365498
  FILE: Number of read operations=0
  FILE: Number of large read operations=0
  FILE: Number of write operations=0
  Map-Reduce Framework
  Map input records=23098
  Map output records=23097
  Map output bytes=711612
  Map output materialized bytes=757830
  Input split bytes=528
  Combine input records=0
  Combine output records=0
  Reduce input groups=1065
  Reduce shuffle bytes=757830
  Reduce input records=23097
  Reduce output records=1065
  Spilled Records=46194
  Shuffled Maps =4
  Failed Shuffles=0
  Merged Map outputs=4
  GC time elapsed (ms)=30
  CPU time spent (ms)=0
  Physical memory (bytes) snapshot=0
  Virtual memory (bytes) snapshot=0
  Total committed heap usage (bytes)=2353528832
  Shuffle Errors
  BAD_ID=0
  CONNECTION=0
  IO_ERROR=0
  WRONG_LENGTH=0
  WRONG_MAP=0
  WRONG_REDUCE=0
  File Input Format Counters
  Bytes Read=585564
  File Output Format Counters
  Bytes Written=340785
  Job executed successfully




  Code
  

package zhouls.bigdata.myMapReduce.weibo;

import java.io.IOException;
import java.io.StringReader;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.wltea.analyzer.core.IKSegmenter;
import org.wltea.analyzer.core.Lexeme;

/**
 * First MR job: computes TF (term frequency per weibo) and N (the total number of weibos).
 * @author root
 */
public class FirstMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Input lines are "<weibo id>\t<weibo text>", for example:
        // 3823890201582094    今天我约了豆浆,油条。约了电饭煲几小时后饭就自动煮好,还想约豆浆机,让我早晨多睡一小时,豆浆就自然好。起床就可以喝上香喷喷的豆浆了。
        // 3823890210294392    今天我约了豆浆,油条
        String[] v = value.toString().trim().split("\t");
        if (v.length >= 2) {
            String id = v[0].trim();
            String content = v[1].trim(); // the weibo text

            // Tokenize the content with the IK Analyzer. Download IKAnalyzer2012_FF.jar,
            // put it under lib, select it, then Build Path -> Add Build Path.
            StringReader sr = new StringReader(content);
            IKSegmenter ikSegmenter = new IKSegmenter(sr, true);
            Lexeme word = null;
            while ((word = ikSegmenter.next()) != null) {
                String w = word.getLexemeText(); // one token
                context.write(new Text(w + "_" + id), new IntWritable(1));
            }
            // Emit one "count" record per weibo so a reducer can total N.
            context.write(new Text("count"), new IntWritable(1));
        } else {
            // Flag malformed lines: TF is the term frequency within a single weibo,
            // so lines without an id/content pair are only printed, not counted.
            System.out.println(value.toString() + "-------------");
        }
    }
}
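
Before wiring the tokenizer into a job, it can help to check that the IK Analyzer works on its own. The following is a minimal sketch that is not part of the original post (the class name IkSmokeTest is made up); it just prints the tokens of one sample weibo:

package zhouls.bigdata.myMapReduce.weibo;

import java.io.IOException;
import java.io.StringReader;

import org.wltea.analyzer.core.IKSegmenter;
import org.wltea.analyzer.core.Lexeme;

// Hypothetical smoke test: confirms IKAnalyzer2012_FF.jar is on the classpath
// by tokenizing one sample weibo and printing each token on its own line.
public class IkSmokeTest {
    public static void main(String[] args) throws IOException {
        IKSegmenter seg = new IKSegmenter(new StringReader("今天我约了豆浆,油条"), true);
        Lexeme word = null;
        while ((word = seg.next()) != null) {
            System.out.println(word.getLexemeText());
        }
    }
}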
  

  

package zhouls.bigdata.myMapReduce.weibo;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;

/**
 * Custom partitioner for the first MR job.
 * @author root
 */
public class FirstPartition extends HashPartitioner<Text, IntWritable> {

    public int getPartition(Text key, IntWritable value, int reduceCount) {
        // Four reducers in total: one (partition 3) outputs the total weibo count,
        // the other three output the per-weibo term frequencies.
        if (key.equals(new Text("count")))
            return 3;
        else
            return super.getPartition(key, value, reduceCount - 1);
    }
}
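
A quick way to see the routing is to call the partitioner directly. This sketch is not from the original post (PartitionCheck is a made-up name): with 4 reduce tasks, "count" always lands in partition 3, and every other key is hashed over partitions 0 to 2 because reduceCount - 1 is passed to the parent HashPartitioner.

package zhouls.bigdata.myMapReduce.weibo;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

// Hypothetical check of FirstPartition's routing with 4 reducers.
public class PartitionCheck {
    public static void main(String[] args) {
        FirstPartition p = new FirstPartition();
        // Always 3: the weibo-count records get a reducer of their own.
        System.out.println(p.getPartition(new Text("count"), new IntWritable(1), 4));
        // Somewhere in 0..2: ordinary word_id keys share the other three reducers.
        System.out.println(p.getPartition(new Text("豆浆_3823890201582094"), new IntWritable(1), 4));
    }
}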
  

  

package zhouls.bigdata.myMapReduce.weibo;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

/**
 * Sums the 1s emitted by FirstMapper, producing records such as:
 *   c1_001   2
 *   c2_001   1
 *   count    10000
 * @author root
 */
public class FirstReduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    protected void reduce(Text arg0, Iterable<IntWritable> arg1, Context arg2)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable i : arg1) {
            sum = sum + i.get();
        }
        if (arg0.equals(new Text("count"))) {
            // The "count___________N" lines in the logs above come from here.
            System.out.println(arg0.toString() + "___________" + sum);
        }
        arg2.write(arg0, new IntWritable(sum));
    }
}
  

  

package zhouls.bigdata.myMapReduce.weibo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class FirstJob {

    public static void main(String[] args) {
        Configuration config = new Configuration();
//      config.set("fs.defaultFS", "hdfs://HadoopMaster:9000");
//      config.set("yarn.resourcemanager.hostname", "HadoopMaster");
        try {
            FileSystem fs = FileSystem.get(config);
            Job job = Job.getInstance(config);
            job.setJarByClass(FirstJob.class);
            job.setJobName("weibo1");

            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            job.setNumReduceTasks(4);
            job.setPartitionerClass(FirstPartition.class);
            job.setMapperClass(FirstMapper.class);
            job.setCombinerClass(FirstReduce.class);
            job.setReducerClass(FirstReduce.class);

            // HDFS variant:
//          FileInputFormat.addInputPath(job, new Path("hdfs://HadoopMaster:9000/Weibodata.txt"));
//          Path path = new Path("hdfs://HadoopMaster:9000/out/weibo1");

            FileInputFormat.addInputPath(job, new Path("./data/weibo/Weibodata.txt")); // the input data, Weibodata.txt
            Path path = new Path("./out/weibo1");
            // Output layout:
            //   part-r-00000 .. part-r-00002  per-weibo term frequencies (3 reducers)
            //   part-r-00003                  the total weibo count
            if (fs.exists(path)) {
                fs.delete(path, true);
            }
            FileOutputFormat.setOutputPath(job, path);

            boolean f = job.waitForCompletion(true);
            if (f) {
                // Nothing extra on success; the framework prints the counters.
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
  

  

package zhouls.bigdata.myMapReduce.weibo;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Second MR job: computes DF, i.e. in how many weibos each word appears.
public class TwoMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Get the input split this map task is processing.
        FileSplit fs = (FileSplit) context.getInputSplit();
        // Skip part-r-00003, which holds the total weibo count rather than word_id records.
        if (!fs.getPath().getName().contains("part-r-00003")) {
            String[] v = value.toString().trim().split("\t");
            if (v.length >= 2) {
                String[] ss = v[0].split("_");
                if (ss.length >= 2) {
                    String w = ss[0]; // the word, without the weibo id
                    context.write(new Text(w), new IntWritable(1));
                }
            } else {
                System.out.println(value.toString() + "-------------");
            }
        }
    }
}
  

  

package zhouls.bigdata.myMapReduce.weibo;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class TwoReduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    protected void reduce(Text key, Iterable<IntWritable> arg1, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable i : arg1) {
            sum = sum + i.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
  

  

package zhouls.bigdata.myMapReduce.weibo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TwoJob {

    public static void main(String[] args) {
        Configuration config = new Configuration();
//      config.set("fs.defaultFS", "hdfs://HadoopMaster:9000");
//      config.set("yarn.resourcemanager.hostname", "HadoopMaster");
        try {
            Job job = Job.getInstance(config);
            job.setJarByClass(TwoJob.class);
            job.setJobName("weibo2");
            // Key/value types of the map output.
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            job.setMapperClass(TwoMapper.class);
            job.setCombinerClass(TwoReduce.class);
            job.setReducerClass(TwoReduce.class);

            // This job reads the first job's output directory.
//          FileInputFormat.addInputPath(job, new Path("hdfs://HadoopMaster:9000/out/weibo1/"));
//          FileOutputFormat.setOutputPath(job, new Path("hdfs://HadoopMaster:9000/out/weibo2"));

            FileInputFormat.addInputPath(job, new Path("./out/weibo1/"));
            FileOutputFormat.setOutputPath(job, new Path("./out/weibo2"));

            boolean f = job.waitForCompletion(true);
            if (f) {
                System.out.println("Job executed successfully");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
  

  

package zhouls.bigdata.myMapReduce.weibo;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

/**
 * Last MR job: the final TF-IDF computation.
 * @author root
 */
public class LastMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Holds the total weibo count.
    public static Map<String, Integer> cmap = null;
    // Holds the DF of each word.
    public static Map<String, Integer> df = null;

    // Runs once per task, before map() is called.
    protected void setup(Context context) throws IOException, InterruptedException {
        System.out.println("******************");
        if (cmap == null || cmap.size() == 0 || df == null || df.size() == 0) {
            URI[] ss = context.getCacheFiles();
            if (ss != null) {
                for (int i = 0; i < ss.length; i++) {
                    URI uri = ss[i];
                    if (uri.getPath().endsWith("part-r-00003")) { // the total weibo count
                        Path path = new Path(uri.getPath());
//                      FileSystem fs = FileSystem.get(context.getConfiguration());
//                      fs.open(path);
                        BufferedReader br = new BufferedReader(new FileReader(path.getName()));
                        String line = br.readLine();
                        if (line.startsWith("count")) {
                            String[] ls = line.split("\t");
                            cmap = new HashMap<String, Integer>();
                            cmap.put(ls[0], Integer.parseInt(ls[1].trim()));
                        }
                        br.close();
                    } else if (uri.getPath().endsWith("part-r-00000")) { // the per-word DF
                        df = new HashMap<String, Integer>();
                        Path path = new Path(uri.getPath());
                        BufferedReader br = new BufferedReader(new FileReader(path.getName()));
                        String line;
                        while ((line = br.readLine()) != null) {
                            String[] ls = line.split("\t");
                            df.put(ls[0], Integer.parseInt(ls[1].trim()));
                        }
                        br.close();
                    }
                }
            }
        }
    }

    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        FileSplit fs = (FileSplit) context.getInputSplit();
        // Skip part-r-00003 (the weibo count); only word_id TF records are wanted here.
        if (!fs.getPath().getName().contains("part-r-00003")) {
            String[] v = value.toString().trim().split("\t");
            if (v.length >= 2) {
                int tf = Integer.parseInt(v[1].trim()); // the TF value
                String[] ss = v[0].split("_");
                if (ss.length >= 2) {
                    String w = ss[0];
                    String id = ss[1];
                    // TF-IDF weight: tf * log(N / df(w)). As written, N / df(w) is
                    // integer division, matching the arithmetic in the Test class below.
                    double s = tf * Math.log(cmap.get("count") / df.get(w));
                    NumberFormat nf = NumberFormat.getInstance();
                    nf.setMaximumFractionDigits(5);
                    context.write(new Text(id), new Text(w + ":" + nf.format(s)));
                }
            } else {
                System.out.println(value.toString() + "-------------");
            }
        }
    }
}
  

  

package zhouls.bigdata.myMapReduce.weibo;

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class LastReduce extends Reducer<Text, Text, Text, Text> {

    protected void reduce(Text key, Iterable<Text> arg1, Context context)
            throws IOException, InterruptedException {
        // Concatenate all word:weight pairs of one weibo into a single output line.
        StringBuffer sb = new StringBuffer();
        for (Text i : arg1) {
            sb.append(i.toString() + "\t");
        }
        context.write(key, new Text(sb.toString()));
    }
}
  

  

package zhouls.bigdata.myMapReduce.weibo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LastJob {

    public static void main(String[] args) {
        Configuration config = new Configuration();
//      config.set("fs.defaultFS", "hdfs://HadoopMaster:9000");
//      config.set("yarn.resourcemanager.hostname", "HadoopMaster");
//      config.set("mapred.jar", "C:\\Users\\Administrator\\Desktop\\weibo3.jar");
        try {
            FileSystem fs = FileSystem.get(config);
            Job job = Job.getInstance(config);
            job.setJarByClass(LastJob.class);
            job.setJobName("weibo3");

            // Load the total weibo count into memory via the distributed cache
            // (the legacy API would be DistributedCache.addCacheFile(uri, conf)).
//          job.addCacheFile(new Path("hdfs://HadoopMaster:9000/out/weibo1/part-r-00003").toUri());
//          // Load the DF into memory.
//          job.addCacheFile(new Path("hdfs://HadoopMaster:9000/out/weibo2/part-r-00000").toUri());

            job.addCacheFile(new Path("./out/weibo1/part-r-00003").toUri());
            // Load the DF into memory.
            job.addCacheFile(new Path("./out/weibo2/part-r-00000").toUri());

            // Key/value types of the map output.
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            job.setMapperClass(LastMapper.class);
            job.setReducerClass(LastReduce.class);

            // This job reads the first job's output directory (the TF records).
//          FileInputFormat.addInputPath(job, new Path("hdfs://HadoopMaster:9000/out/weibo1"));
//          Path outpath = new Path("hdfs://HadoopMaster:9000/out/weibo3/");

            FileInputFormat.addInputPath(job, new Path("./out/weibo1"));
            Path outpath = new Path("./out/weibo3/");

            if (fs.exists(outpath)) {
                fs.delete(outpath, true);
            }
            FileOutputFormat.setOutputPath(job, outpath);

            boolean f = job.waitForCompletion(true);
            if (f) {
                System.out.println("Job executed successfully");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
  

  

package zhouls.bigdata.myMapReduce.weibo;

import java.text.NumberFormat;

// Standalone check of the TF-IDF arithmetic used in LastMapper.
// (The class name was truncated in the forum copy; "Test" is assumed here.)
public class Test {

    public static void main(String[] args) {
        // Matches LastMapper: tf * Math.log(N / df), where 1056 / 5 is integer division.
        double s = 34 * Math.log(1056 / 5);
        NumberFormat nf = NumberFormat.getInstance();
        nf.setMaximumFractionDigits(5);
        System.out.println(nf.format(s));
    }
}
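
The three drivers above are run one after another, each job reading the directory the previous one wrote: this is the "multiple-job iterative MapReduce" pattern in the title. Below is a minimal sketch of a single entry point that chains them; it is not in the original post (WeiboJobChain is a made-up name), and since the original mains only print a stack trace on failure, a production version should propagate success or failure between stages instead.

package zhouls.bigdata.myMapReduce.weibo;

// Hypothetical driver that chains the three jobs in order; the order matters
// because each job consumes the previous job's output directory.
public class WeiboJobChain {
    public static void main(String[] args) throws Exception {
        FirstJob.main(args); // writes ./out/weibo1 : TF per word_id, plus the weibo count in part-r-00003
        TwoJob.main(args);   // reads ./out/weibo1, writes ./out/weibo2 : DF per word
        LastJob.main(args);  // reads ./out/weibo1 plus the cached files, writes ./out/weibo3 : TF-IDF per weibo
    }
}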
  