[Experience Sharing] Benchmarking a Hadoop Cluster with TeraSort, NNBench and MRBench

  In this blog post I introduce some of the benchmarking and testing tools in the Apache Hadoop distribution, namely TeraSort, NNBench and MRBench. These are popular choices for benchmarking a Hadoop cluster.
  Before we start, let me show you the cluster on which the tests will run:


  • Three VMware virtual machines (nodes) running on OS X Mountain Lion
  • Node1: 2 processors, 2GB memory, used as NameNode as well as DataNode
  • Node2: 1 processor, 1GB memory, used as Secondary NameNode as well as DataNode
  • Node3: 1 processor, 1GB memory, used as DataNode
  Now let's start the benchmark tests.
  TeraSort benchmark test
  A full TeraSort benchmark run consists of the following three steps:


  • Generating the input data via TeraGen.
  • Running the actual TeraSort on the input data.
  • Validating the sorted output data via TeraValidate.
  Now let's generate the input data with TeraGen:

[iyunv@n1 lib]# hadoop jar hadoop-examples.jar teragen 1000 /user/root/terasort-input
13/07/12 21:37:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
Generating 1000 using 2 maps with step of 500
13/07/12 21:37:09 INFO mapred.JobClient: Running job: job_201307122107_0001
13/07/12 21:37:10 INFO mapred.JobClient:  map 0% reduce 0%
13/07/12 21:37:35 INFO mapred.JobClient:  map 50% reduce 0%
13/07/12 21:38:28 INFO mapred.JobClient:  map 100% reduce 0%
13/07/12 21:39:03 INFO mapred.JobClient: Job complete: job_201307122107_0001
13/07/12 21:39:05 INFO mapred.JobClient: Counters: 24
13/07/12 21:39:06 INFO mapred.JobClient:   File System Counters
13/07/12 21:39:06 INFO mapred.JobClient:     FILE: Number of bytes read=0
13/07/12 21:39:06 INFO mapred.JobClient:     FILE: Number of bytes written=309768
13/07/12 21:39:06 INFO mapred.JobClient:     FILE: Number of read operations=0
13/07/12 21:39:06 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/07/12 21:39:06 INFO mapred.JobClient:     FILE: Number of write operations=0
13/07/12 21:39:06 INFO mapred.JobClient:     HDFS: Number of bytes read=164
13/07/12 21:39:06 INFO mapred.JobClient:     HDFS: Number of bytes written=100000
13/07/12 21:39:06 INFO mapred.JobClient:     HDFS: Number of read operations=3
13/07/12 21:39:06 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/07/12 21:39:06 INFO mapred.JobClient:     HDFS: Number of write operations=2
13/07/12 21:39:06 INFO mapred.JobClient:   Job Counters
13/07/12 21:39:06 INFO mapred.JobClient:     Launched map tasks=2
13/07/12 21:39:06 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=93872
13/07/12 21:39:06 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
13/07/12 21:39:06 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/07/12 21:39:06 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/07/12 21:39:06 INFO mapred.JobClient:   Map-Reduce Framework
13/07/12 21:39:06 INFO mapred.JobClient:     Map input records=1000
13/07/12 21:39:06 INFO mapred.JobClient:     Map output records=1000
13/07/12 21:39:06 INFO mapred.JobClient:     Input split bytes=164
13/07/12 21:39:06 INFO mapred.JobClient:     Spilled Records=0
13/07/12 21:39:06 INFO mapred.JobClient:     CPU time spent (ms)=1360
13/07/12 21:39:06 INFO mapred.JobClient:     Physical memory (bytes) snapshot=178167808
13/07/12 21:39:06 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2249502720
13/07/12 21:39:06 INFO mapred.JobClient:     Total committed heap usage (bytes)=48758784
13/07/12 21:39:06 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/07/12 21:39:06 INFO mapred.JobClient:     BYTES_READ=1000
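  Note that teragen's first argument is the number of 100-byte rows to generate, which is why the counters above report 100,000 HDFS bytes written for 1,000 rows. That is only a smoke test on these small VMs; as a rough sketch, a terabyte-scale run on real hardware would simply use more rows (the output path here is illustrative):

hadoop jar hadoop-examples.jar teragen 10000000000 /user/root/terasort-input-1tb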
  Check the data generated:

[iyunv@n1 lib]# hadoop fs -ls ./terasort-input
Found 4 items
-rw-r--r--   3 root supergroup          0 2013-07-12 21:38 terasort-input/_SUCCESS
drwxr-xr-x   - root supergroup          0 2013-07-12 21:37 terasort-input/_logs
-rw-r--r--   3 root supergroup      50000 2013-07-12 21:37 terasort-input/part-00000
-rw-r--r--   3 root supergroup      50000 2013-07-12 21:38 terasort-input/part-00001
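  To confirm the total input size rather than eyeballing the listing, hadoop fs -du sums up each entry (the exact output format varies by Hadoop version):

hadoop fs -du ./terasort-input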
  Run the terasort test:

[iyunv@n1 lib]# hadoop jar hadoop-examples.jar terasort terasort-input terasort-output
13/07/12 21:53:19 INFO terasort.TeraSort: starting
13/07/12 21:53:21 INFO mapred.FileInputFormat: Total input paths to process : 2
13/07/12 21:53:21 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
13/07/12 21:53:21 INFO compress.CodecPool: Got brand-new compressor [.deflate]
Making 1 from 1000 records
Step size is 1000.0
13/07/12 21:53:22 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/07/12 21:53:26 INFO mapred.JobClient: Running job: job_201307122107_0002
13/07/12 21:53:27 INFO mapred.JobClient:  map 0% reduce 0%
13/07/12 21:53:46 INFO mapred.JobClient:  map 100% reduce 0%
13/07/12 21:53:57 INFO mapred.JobClient:  map 100% reduce 100%
13/07/12 21:54:01 INFO mapred.JobClient: Job complete: job_201307122107_0002
13/07/12 21:54:01 INFO mapred.JobClient: Counters: 33
13/07/12 21:54:01 INFO mapred.JobClient:   File System Counters
13/07/12 21:54:01 INFO mapred.JobClient:     FILE: Number of bytes read=23088
13/07/12 21:54:01 INFO mapred.JobClient:     FILE: Number of bytes written=520103
13/07/12 21:54:01 INFO mapred.JobClient:     FILE: Number of read operations=0
13/07/12 21:54:01 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/07/12 21:54:01 INFO mapred.JobClient:     FILE: Number of write operations=0
13/07/12 21:54:01 INFO mapred.JobClient:     HDFS: Number of bytes read=100230
13/07/12 21:54:01 INFO mapred.JobClient:     HDFS: Number of bytes written=100000
13/07/12 21:54:01 INFO mapred.JobClient:     HDFS: Number of read operations=4
13/07/12 21:54:01 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/07/12 21:54:01 INFO mapred.JobClient:     HDFS: Number of write operations=1
13/07/12 21:54:01 INFO mapred.JobClient:   Job Counters
13/07/12 21:54:01 INFO mapred.JobClient:     Launched map tasks=2
13/07/12 21:54:01 INFO mapred.JobClient:     Launched reduce tasks=1
13/07/12 21:54:01 INFO mapred.JobClient:     Data-local map tasks=2
13/07/12 21:54:01 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=26310
13/07/12 21:54:01 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=8722
13/07/12 21:54:01 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/07/12 21:54:01 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/07/12 21:54:01 INFO mapred.JobClient:   Map-Reduce Framework
13/07/12 21:54:01 INFO mapred.JobClient:     Map input records=1000
13/07/12 21:54:01 INFO mapred.JobClient:     Map output records=1000
13/07/12 21:54:01 INFO mapred.JobClient:     Map output bytes=100000
13/07/12 21:54:01 INFO mapred.JobClient:     Input split bytes=230
13/07/12 21:54:01 INFO mapred.JobClient:     Combine input records=0
13/07/12 21:54:01 INFO mapred.JobClient:     Combine output records=0
13/07/12 21:54:01 INFO mapred.JobClient:     Reduce input groups=1000
13/07/12 21:54:01 INFO mapred.JobClient:     Reduce shuffle bytes=22876
13/07/12 21:54:01 INFO mapred.JobClient:     Reduce input records=1000
13/07/12 21:54:01 INFO mapred.JobClient:     Reduce output records=1000
13/07/12 21:54:01 INFO mapred.JobClient:     Spilled Records=2000
13/07/12 21:54:01 INFO mapred.JobClient:     CPU time spent (ms)=3780
13/07/12 21:54:01 INFO mapred.JobClient:     Physical memory (bytes) snapshot=408850432
13/07/12 21:54:01 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1962823680
13/07/12 21:54:01 INFO mapred.JobClient:     Total committed heap usage (bytes)=147070976
13/07/12 21:54:01 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/07/12 21:54:01 INFO mapred.JobClient:     BYTES_READ=100000
13/07/12 21:54:01 INFO terasort.TeraSort: done
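  A note on scaling this step: TeraSort takes its reducer count from mapred.reduce.tasks, and the sampler line above ("Making 1 from 1000 records") shows this run used a single reduce partition. On a bigger cluster you would raise it, roughly like this (a sketch; if your examples jar does not accept generic -D options, set the property in mapred-site.xml instead):

hadoop jar hadoop-examples.jar terasort -Dmapred.reduce.tasks=8 terasort-input terasort-output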
  Validate the job output with teravalidate:

[iyunv@n1 lib]# hadoop jar hadoop-examples.jar teravalidate terasort-output terasort-validate
13/07/12 21:56:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/07/12 21:56:04 INFO mapred.FileInputFormat: Total input paths to process : 1
13/07/12 21:56:10 INFO mapred.JobClient: Running job: job_201307122107_0003
13/07/12 21:56:11 INFO mapred.JobClient:  map 0% reduce 0%
13/07/12 21:56:23 INFO mapred.JobClient:  map 100% reduce 0%
13/07/12 21:56:31 INFO mapred.JobClient:  map 100% reduce 100%
13/07/12 21:56:34 INFO mapred.JobClient: Job complete: job_201307122107_0003
13/07/12 21:56:34 INFO mapred.JobClient: Counters: 33
13/07/12 21:56:34 INFO mapred.JobClient:   File System Counters
13/07/12 21:56:34 INFO mapred.JobClient:     FILE: Number of bytes read=69
13/07/12 21:56:34 INFO mapred.JobClient:     FILE: Number of bytes written=310607
13/07/12 21:56:34 INFO mapred.JobClient:     FILE: Number of read operations=0
13/07/12 21:56:34 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/07/12 21:56:34 INFO mapred.JobClient:     FILE: Number of write operations=0
13/07/12 21:56:34 INFO mapred.JobClient:     HDFS: Number of bytes read=100116
13/07/12 21:56:34 INFO mapred.JobClient:     HDFS: Number of bytes written=0
13/07/12 21:56:34 INFO mapred.JobClient:     HDFS: Number of read operations=3
13/07/12 21:56:34 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/07/12 21:56:34 INFO mapred.JobClient:     HDFS: Number of write operations=2
13/07/12 21:56:34 INFO mapred.JobClient:   Job Counters
13/07/12 21:56:34 INFO mapred.JobClient:     Launched map tasks=1
13/07/12 21:56:34 INFO mapred.JobClient:     Launched reduce tasks=1
13/07/12 21:56:34 INFO mapred.JobClient:     Data-local map tasks=1
13/07/12 21:56:34 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=14493
13/07/12 21:56:34 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=6647
13/07/12 21:56:34 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/07/12 21:56:34 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/07/12 21:56:34 INFO mapred.JobClient:   Map-Reduce Framework
13/07/12 21:56:34 INFO mapred.JobClient:     Map input records=1000
13/07/12 21:56:34 INFO mapred.JobClient:     Map output records=2
13/07/12 21:56:34 INFO mapred.JobClient:     Map output bytes=54
13/07/12 21:56:34 INFO mapred.JobClient:     Input split bytes=116
13/07/12 21:56:34 INFO mapred.JobClient:     Combine input records=0
13/07/12 21:56:34 INFO mapred.JobClient:     Combine output records=0
13/07/12 21:56:34 INFO mapred.JobClient:     Reduce input groups=2
13/07/12 21:56:34 INFO mapred.JobClient:     Reduce shuffle bytes=65
13/07/12 21:56:34 INFO mapred.JobClient:     Reduce input records=2
13/07/12 21:56:34 INFO mapred.JobClient:     Reduce output records=0
13/07/12 21:56:34 INFO mapred.JobClient:     Spilled Records=4
13/07/12 21:56:34 INFO mapred.JobClient:     CPU time spent (ms)=1640
13/07/12 21:56:34 INFO mapred.JobClient:     Physical memory (bytes) snapshot=250499072
13/07/12 21:56:34 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1310330880
13/07/12 21:56:34 INFO mapred.JobClient:     Total committed heap usage (bytes)=81399808
13/07/12 21:56:34 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/07/12 21:56:34 INFO mapred.JobClient:     BYTES_READ=100000
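  TeraValidate only emits records when it finds keys out of order, so Reduce output records=0 above, together with an empty output file, means the data was sorted correctly. You can confirm with:

hadoop fs -cat terasort-validate/part-*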
  Hadoop provides a very convenient way to access statistics about a job from the command line:

$ hadoop job -history all terasort-output
  You can also see the detailed results via the Hadoop JobTracker web UI.
  NameNode benchmark (nnbench)
  NNBench is useful for load testing the NameNode hardware and configuration. It generates a lot of HDFS-related requests with normally very small "payloads" for the sole purpose of putting a high HDFS management stress on the NameNode. The benchmark can simulate requests for creating, reading, renaming and deleting files on HDFS.
  The syntax of NNBench is as follows:

[iyunv@n1 lib]# hadoop jar /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/hadoop-0.20-mapreduce/hadoop-test.jar nnbench
NameNode Benchmark 0.4
Usage: nnbench <options>
Options:
-operation <Available operations are create_write open_read rename delete. This option is mandatory>
* NOTE: The open_read, rename and delete operations assume that the files they operate on, are already available. The create_write operation must be run before running the other operations.
-maps <number of maps. default is 1. This is not mandatory>
-reduces <number of reduces. default is 1. This is not mandatory>
-startTime <time to start, given in seconds from the epoch. Make sure this is far enough into the future, so all maps (operations) will start at the same time>. default is launch time + 2 mins. This is not mandatory
-blockSize <Block size in bytes. default is 1. This is not mandatory>
-bytesToWrite <Bytes to write. default is 0. This is not mandatory>
-bytesPerChecksum <Bytes per checksum for the files. default is 1. This is not mandatory>
-numberOfFiles <number of files to create. default is 1. This is not mandatory>
-replicationFactorPerFile <Replication factor for the files. default is 1. This is not mandatory>
-baseDir <base DFS path. default is /becnhmarks/NNBench. This is not mandatory>
-readFileAfterOpen <true or false. if true, it reads the file and reports the average time to read. This is valid with the open_read operation. default is false. This is not mandatory>
-help: Display the help statement
  To run the NameNode benchmark with 6 mappers and 3 reducers:

[iyunv@n1 lib]# hadoop jar /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/hadoop-0.20-mapreduce/hadoop-test.jar nnbench -operation create_write -maps 6 -reduces 3 -blockSize 1 -bytesToWrite 0 -numberOfFiles 100 -replicationFactorPerFile 3 -readFileAfterOpen true -baseDir /benchmarks/NNBench-`hostname -s`
NameNode Benchmark 0.4
13/07/12 22:13:42 INFO hdfs.NNBench: Test Inputs:
13/07/12 22:13:42 INFO hdfs.NNBench:            Test Operation: create_write
13/07/12 22:13:42 INFO hdfs.NNBench:                Start time: 2013-07-12 22:15:42,26
13/07/12 22:13:42 INFO hdfs.NNBench:            Number of maps: 6
13/07/12 22:13:42 INFO hdfs.NNBench:         Number of reduces: 3
13/07/12 22:13:42 INFO hdfs.NNBench:                Block Size: 1
13/07/12 22:13:42 INFO hdfs.NNBench:            Bytes to write: 0
13/07/12 22:13:42 INFO hdfs.NNBench:        Bytes per checksum: 1
13/07/12 22:13:42 INFO hdfs.NNBench:           Number of files: 100
13/07/12 22:13:42 INFO hdfs.NNBench:        Replication factor: 3
13/07/12 22:13:42 INFO hdfs.NNBench:                  Base dir: /benchmarks/NNBench-n1
13/07/12 22:13:42 INFO hdfs.NNBench:      Read file after open: true
13/07/12 22:13:43 INFO hdfs.NNBench: Deleting data directory
13/07/12 22:13:43 INFO hdfs.NNBench: Creating 6 control files
13/07/12 22:13:43 WARN conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
13/07/12 22:13:44 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/07/12 22:13:44 INFO mapred.FileInputFormat: Total input paths to process : 6
13/07/12 22:13:44 WARN conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
13/07/12 22:13:44 INFO mapred.JobClient: Running job: job_201307122107_0005
13/07/12 22:13:45 INFO mapred.JobClient:  map 0% reduce 0%
13/07/12 22:14:03 INFO mapred.JobClient:  map 33% reduce 0%
13/07/12 22:14:05 INFO mapred.JobClient:  map 67% reduce 0%
13/07/12 22:15:57 INFO mapred.JobClient:  map 83% reduce 0%
13/07/12 22:15:58 INFO mapred.JobClient:  map 100% reduce 0%
13/07/12 22:16:07 INFO mapred.JobClient:  map 100% reduce 67%
13/07/12 22:16:09 INFO mapred.JobClient:  map 100% reduce 100%
13/07/12 22:16:11 INFO mapred.JobClient: Job complete: job_201307122107_0005
13/07/12 22:16:11 INFO mapred.JobClient: Counters: 33
13/07/12 22:16:11 INFO mapred.JobClient:   File System Counters
13/07/12 22:16:11 INFO mapred.JobClient:     FILE: Number of bytes read=359
13/07/12 22:16:11 INFO mapred.JobClient:     FILE: Number of bytes written=1448711
13/07/12 22:16:11 INFO mapred.JobClient:     FILE: Number of read operations=0
13/07/12 22:16:11 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/07/12 22:16:11 INFO mapred.JobClient:     FILE: Number of write operations=0
13/07/12 22:16:11 INFO mapred.JobClient:     HDFS: Number of bytes read=1530
13/07/12 22:16:11 INFO mapred.JobClient:     HDFS: Number of bytes written=182
13/07/12 22:16:11 INFO mapred.JobClient:     HDFS: Number of read operations=21
13/07/12 22:16:11 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/07/12 22:16:11 INFO mapred.JobClient:     HDFS: Number of write operations=4006
13/07/12 22:16:11 INFO mapred.JobClient:   Job Counters
13/07/12 22:16:11 INFO mapred.JobClient:     Launched map tasks=6
13/07/12 22:16:11 INFO mapred.JobClient:     Launched reduce tasks=3
13/07/12 22:16:11 INFO mapred.JobClient:     Data-local map tasks=6
13/07/12 22:16:11 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=498450
13/07/12 22:16:11 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=24054
13/07/12 22:16:11 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/07/12 22:16:11 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/07/12 22:16:11 INFO mapred.JobClient:   Map-Reduce Framework
13/07/12 22:16:11 INFO mapred.JobClient:     Map input records=6
13/07/12 22:16:11 INFO mapred.JobClient:     Map output records=44
13/07/12 22:16:11 INFO mapred.JobClient:     Map output bytes=974
13/07/12 22:16:11 INFO mapred.JobClient:     Input split bytes=786
13/07/12 22:16:11 INFO mapred.JobClient:     Combine input records=0
13/07/12 22:16:11 INFO mapred.JobClient:     Combine output records=0
13/07/12 22:16:11 INFO mapred.JobClient:     Reduce input groups=8
13/07/12 22:16:11 INFO mapred.JobClient:     Reduce shuffle bytes=1227
13/07/12 22:16:11 INFO mapred.JobClient:     Reduce input records=44
13/07/12 22:16:11 INFO mapred.JobClient:     Reduce output records=8
13/07/12 22:16:11 INFO mapred.JobClient:     Spilled Records=88
13/07/12 22:16:11 INFO mapred.JobClient:     CPU time spent (ms)=16050
13/07/12 22:16:11 INFO mapred.JobClient:     Physical memory (bytes) snapshot=1233637376
13/07/12 22:16:11 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=8789716992
13/07/12 22:16:11 INFO mapred.JobClient:     Total committed heap usage (bytes)=525942784
13/07/12 22:16:11 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/07/12 22:16:11 INFO mapred.JobClient:     BYTES_READ=228
13/07/12 22:16:11 INFO hdfs.NNBench: -------------- NNBench -------------- :
13/07/12 22:16:11 INFO hdfs.NNBench:                                Version: NameNode Benchmark 0.4
13/07/12 22:16:11 INFO hdfs.NNBench:                            Date & time: 2013-07-12 22:16:11,562
13/07/12 22:16:11 INFO hdfs.NNBench:
13/07/12 22:16:11 INFO hdfs.NNBench:                         Test Operation: create_write
13/07/12 22:16:11 INFO hdfs.NNBench:                             Start time: 2013-07-12 22:15:42,26
13/07/12 22:16:11 INFO hdfs.NNBench:                            Maps to run: 6
13/07/12 22:16:11 INFO hdfs.NNBench:                         Reduces to run: 3
13/07/12 22:16:11 INFO hdfs.NNBench:                     Block Size (bytes): 1
13/07/12 22:16:11 INFO hdfs.NNBench:                         Bytes to write: 0
13/07/12 22:16:11 INFO hdfs.NNBench:                     Bytes per checksum: 1
13/07/12 22:16:11 INFO hdfs.NNBench:                        Number of files: 100
13/07/12 22:16:11 INFO hdfs.NNBench:                     Replication factor: 3
13/07/12 22:16:11 INFO hdfs.NNBench:             Successful file operations: 0
13/07/12 22:16:11 INFO hdfs.NNBench:
13/07/12 22:16:11 INFO hdfs.NNBench:         # maps that missed the barrier: 0
13/07/12 22:16:11 INFO hdfs.NNBench:                           # exceptions: 0
13/07/12 22:16:11 INFO hdfs.NNBench:
13/07/12 22:16:11 INFO hdfs.NNBench:                TPS: Create/Write/Close: 0
13/07/12 22:16:11 INFO hdfs.NNBench: Avg exec time (ms): Create/Write/Close: 0.0
13/07/12 22:16:11 INFO hdfs.NNBench:             Avg Lat (ms): Create/Write: NaN
13/07/12 22:16:11 INFO hdfs.NNBench:                    Avg Lat (ms): Close: NaN
13/07/12 22:16:11 INFO hdfs.NNBench:
13/07/12 22:16:11 INFO hdfs.NNBench:                  RAW DATA: AL Total #1: 0
13/07/12 22:16:11 INFO hdfs.NNBench:                  RAW DATA: AL Total #2: 0
13/07/12 22:16:11 INFO hdfs.NNBench:               RAW DATA: TPS Total (ms): 0
13/07/12 22:16:11 INFO hdfs.NNBench:        RAW DATA: Longest Map Time (ms): 0.0
13/07/12 22:16:11 INFO hdfs.NNBench:                    RAW DATA: Late maps: 0
13/07/12 22:16:11 INFO hdfs.NNBench:              RAW DATA: # of exceptions: 0
13/07/12 22:16:11 INFO hdfs.NNBench:

  Note the trick used here: the output directory is suffixed with the machine's short hostname (`hostname -s`). This simple trick ensures that one box does not accidentally write into the same output directory as another machine running nnbench at the same time.
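  To put real pressure on the NameNode you would launch nnbench from several machines at once; the hostname suffix keeps their output directories apart. A rough sketch, assuming passwordless ssh and the same CDH parcel path on every node:

for h in n1 n2 n3; do
  ssh $h 'hadoop jar /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/hadoop-0.20-mapreduce/hadoop-test.jar nnbench -operation create_write -maps 6 -reduces 3 -numberOfFiles 100 -baseDir /benchmarks/NNBench-$(hostname -s)' &
done
wait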
  MapReduce benchmark (mrbench)
  MRBench loops a small job a number of times. As such it is a complementary benchmark to the "large-scale" TeraSort benchmark suite, because MRBench checks whether small job runs are responsive and complete efficiently on your cluster. It focuses on the MapReduce layer, as its impact on the HDFS layer is very limited.
  The default parameters of mrbench are:

-baseDir: /benchmarks/MRBench  [*** see my note above ***]
-numRuns: 1
-maps: 2
-reduces: 1
-inputLines: 1
-inputType: ascending
  Run mrbench with default parameters:

[iyunv@n1 lib]# hadoop jar /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/hadoop-0.20-mapreduce/hadoop-test.jar mrbench
MRBenchmark.0.0.2
13/07/12 22:04:42 INFO mapred.MRBench: creating control file: 1 numLines, ASCENDING sortOrder
13/07/12 22:04:42 INFO mapred.MRBench: created control file: /benchmarks/MRBench/mr_input/input_-1751865361.txt
13/07/12 22:04:43 INFO mapred.MRBench: Running job 0: input=hdfs://n1.example.com:8020/benchmarks/MRBench/mr_input output=hdfs://n1.example.com:8020/benchmarks/MRBench/mr_output/output_-1484101927
13/07/12 22:04:43 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/07/12 22:04:44 INFO mapred.FileInputFormat: Total input paths to process : 1
13/07/12 22:04:47 INFO mapred.JobClient: Running job: job_201307122107_0004
13/07/12 22:04:49 INFO mapred.JobClient:  map 0% reduce 0%
13/07/12 22:05:41 INFO mapred.JobClient:  map 50% reduce 0%
13/07/12 22:05:48 INFO mapred.JobClient:  map 100% reduce 0%
13/07/12 22:05:58 INFO mapred.JobClient:  map 100% reduce 100%
13/07/12 22:06:00 INFO mapred.JobClient: Job complete: job_201307122107_0004
13/07/12 22:06:00 INFO mapred.JobClient: Counters: 33
13/07/12 22:06:00 INFO mapred.JobClient:   File System Counters
13/07/12 22:06:00 INFO mapred.JobClient:     FILE: Number of bytes read=27
13/07/12 22:06:00 INFO mapred.JobClient:     FILE: Number of bytes written=468313
13/07/12 22:06:00 INFO mapred.JobClient:     FILE: Number of read operations=0
13/07/12 22:06:00 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/07/12 22:06:00 INFO mapred.JobClient:     FILE: Number of write operations=0
13/07/12 22:06:00 INFO mapred.JobClient:     HDFS: Number of bytes read=261
13/07/12 22:06:00 INFO mapred.JobClient:     HDFS: Number of bytes written=3
13/07/12 22:06:00 INFO mapred.JobClient:     HDFS: Number of read operations=5
13/07/12 22:06:00 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/07/12 22:06:00 INFO mapred.JobClient:     HDFS: Number of write operations=2
13/07/12 22:06:00 INFO mapred.JobClient:   Job Counters
13/07/12 22:06:00 INFO mapred.JobClient:     Launched map tasks=2
13/07/12 22:06:00 INFO mapred.JobClient:     Launched reduce tasks=1
13/07/12 22:06:00 INFO mapred.JobClient:     Data-local map tasks=2
13/07/12 22:06:00 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=50958
13/07/12 22:06:00 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=7753
13/07/12 22:06:00 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/07/12 22:06:00 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/07/12 22:06:00 INFO mapred.JobClient:   Map-Reduce Framework
13/07/12 22:06:00 INFO mapred.JobClient:     Map input records=1
13/07/12 22:06:00 INFO mapred.JobClient:     Map output records=1
13/07/12 22:06:00 INFO mapred.JobClient:     Map output bytes=5
13/07/12 22:06:00 INFO mapred.JobClient:     Input split bytes=258
13/07/12 22:06:00 INFO mapred.JobClient:     Combine input records=0
13/07/12 22:06:00 INFO mapred.JobClient:     Combine output records=0
13/07/12 22:06:00 INFO mapred.JobClient:     Reduce input groups=1
13/07/12 22:06:00 INFO mapred.JobClient:     Reduce shuffle bytes=39
13/07/12 22:06:00 INFO mapred.JobClient:     Reduce input records=1
13/07/12 22:06:00 INFO mapred.JobClient:     Reduce output records=1
13/07/12 22:06:00 INFO mapred.JobClient:     Spilled Records=2
13/07/12 22:06:00 INFO mapred.JobClient:     CPU time spent (ms)=2920
13/07/12 22:06:00 INFO mapred.JobClient:     Physical memory (bytes) snapshot=398467072
13/07/12 22:06:00 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=3889000448
13/07/12 22:06:00 INFO mapred.JobClient:     Total committed heap usage (bytes)=204607488
13/07/12 22:06:00 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/07/12 22:06:00 INFO mapred.JobClient:     BYTES_READ=2
DataLines       Maps    Reduces AvgTime (milliseconds)
1               2       1       77797
  This means that the average finish time of the executed job was about 78 seconds (77,797 ms).
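  A single iteration is not very meaningful on its own; to average over many runs, pass the flags from the defaults list above explicitly, e.g. (the run count here is illustrative):

hadoop jar /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/hadoop-0.20-mapreduce/hadoop-test.jar mrbench -numRuns 50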
