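KVM Performance Testing
The main purpose of KVM performance testing is to measure how much of the physical machine's performance is lost when a workload runs in a virtual machine on KVM. This round compares CPU, memory, and I/O performance under KVM virtualization. The method: run a benchmark on the non-virtualized native system, run the same benchmark in a guest configured as closely as possible to the native system, then compare the results from the two environments. To keep the comparison accurate, the test environment should match the native system as closely as possible. In /boot/grub/grub.cfg, appending maxcpus=2 nr_cpus=2 mem=2G to the line that boots the kernel limits the number of CPU cores and the amount of memory the Linux kernel uses.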
set root='(hd0,msdos1)'
search --no-floppy --fs-uuid --set=root 3940bb4d-c220-4cb5-b4f5-6dd11c5ecb44
linux /boot/vmlinuz-3.2.0-83-generic root=UUID=3940bb4d-c220-4cb5-b4f5-6dd11c5ecb44 ro quiet maxcpus=2 nr_cpus=2 mem=2G
initrd /boot/initrd.img-3.2.0-83-generic
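The above is from /boot/grub/grub.cfg on Ubuntu; Red Hat based distributions differ slightly.
Both the native system and the guest have 1 CPU with 2 cores and 2 GB of memory. Because the I/O test only works on a 512 MB file, the difference in disk size between the physical machine and the virtual machine has little effect.
CPU performance test
Super PI is used for the CPU test. This benchmark computes π to 2^20 and to 2^24 digits after the decimal point; when the calculation finishes, the program prints the time it took. The commands are: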
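Before benchmarking, it can help to confirm that the limits really reached the kernel. The following is a minimal Python sketch, not part of the original procedure: the helper names `parse_kernel_cmdline` and `limits_applied` are made up for illustration, and the sample string is the kernel line from the grub.cfg excerpt. On a running system the same check could be fed the contents of /proc/cmdline.

```python
def parse_kernel_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        # Split on the first "=" only, so root=UUID=... keeps its full value.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def limits_applied(cmdline: str, expected: dict) -> bool:
    """Return True when every expected parameter has the expected value."""
    params = parse_kernel_cmdline(cmdline)
    return all(params.get(name) == value for name, value in expected.items())


if __name__ == "__main__":
    # Sample taken from the grub.cfg excerpt; on a live system use
    # open("/proc/cmdline").read() instead (hypothetical usage).
    sample = ("root=UUID=3940bb4d-c220-4cb5-b4f5-6dd11c5ecb44 ro quiet "
              "maxcpus=2 nr_cpus=2 mem=2G")
    wanted = {"maxcpus": "2", "nr_cpus": "2", "mem": "2G"}
    print(limits_applied(sample, wanted))
```

Running the same check on both the native system and the guest gives a quick sanity test that the two environments were configured alike.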
root@ubuntu:~/super_pi# ./super_pi 20
...
root@ubuntu:~/super_pi# ./super_pi 24
...
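On an x86_64 system, running the Super PI binary may fail because the ld-linux.so.2 shared library cannot be found. Super PI is an old 32-bit program; on Ubuntu, installing the libc6-i386 package fixes this.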
root@ubuntu:~/super_pi# apt-cache search libc6-i386
libc6-i386 - Embedded GNU C Library: 32-bit shared libraries for AMD64
root@ubuntu:~/super_pi# apt-get install libc6-i386
...
When the program finishes, it prints Total calculation(I/O) time:
./pi 20         Run 1      Run 2      Run 3      Run 4      Run 5
host_ubuntu     12.037     11.785     11.744     11.911     11.852
virt_ubuntu     11.986     11.925     11.994     12.04      11.919
./pi 24         Run 1      Run 2      Run 3      Run 4      Run 5
host_ubuntu     333.967    332.558    331.512    335.048    331.745
virt_ubuntu     342.457    342.003    339.275    342.685    343.375
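The comparison shows that CPU performance under KVM virtualization is about 97% of the native system on the longer 2^24 run, and about 99% on the 2^20 run.
Memory performance test
LMbench is used for the memory test. It bundles many simple benchmarks covering file reads and writes, memory operations, pipes, system calls, context switches, process creation and destruction, networking, and more. LMbench can also compare systems of the same class to show their relative strengths and weaknesses, and by selecting different library functions it can compare the performance of those functions.
Download LMbench to get lmbench3.tar.gz, extract it, and run make to build it.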
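The percentage can be reproduced from the timings in the table. The sketch below is one way to do it in Python, assuming the comparison is made on the mean of the five runs (the source does not say how its figure was derived); the name `relative_performance` is illustrative. Because Super PI reports elapsed time, lower is better, so the guest's relative performance is host time divided by guest time.

```python
from statistics import mean

# Super PI timings copied from the results table (five runs each).
HOST_20 = [12.037, 11.785, 11.744, 11.911, 11.852]
VIRT_20 = [11.986, 11.925, 11.994, 12.04, 11.919]
HOST_24 = [333.967, 332.558, 331.512, 335.048, 331.745]
VIRT_24 = [342.457, 342.003, 339.275, 342.685, 343.375]


def relative_performance(host_times, guest_times):
    """Guest performance as a fraction of host for a time-based benchmark."""
    # Lower time is better, so the host mean goes in the numerator.
    return mean(host_times) / mean(guest_times)


if __name__ == "__main__":
    for label, host, virt in (("2^20 digits", HOST_20, VIRT_20),
                              ("2^24 digits", HOST_24, VIRT_24)):
        print(f"{label}: {relative_performance(host, virt):.1%}")
```

With these numbers the short run lands near 99% and the long run near 97%, which matches the figure quoted in the text for the longer test.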
root@ubuntu:/home/luyi# tar -zx -f lmbench3.tar.gz -C lmbench3
root@ubuntu:/home/luyi# cd lmbench3/
root@ubuntu:/home/luyi/lmbench3# make
...
The build may fail with the following error:
make: *** No rule to make target `../SCCS/s.ChangeSet', needed by `bk.ver'.  Stop.
make: Leaving directory `/home/luyi/lmbench3/lmbench3/src'
make: *** Error 2
make: Leaving directory `/home/luyi/lmbench3/lmbench3/src'
make: *** Error 2
Creating the missing directory and file works around the error; then run make results to start the test:
root@ubuntu:/home/luyi/lmbench3/lmbench3# mkdir SCCS ; touch SCCS/s.ChangeSet
root@ubuntu:/home/luyi/lmbench3/lmbench3# make
...
root@ubuntu:/home/luyi/lmbench3/lmbench3# make results
...
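After make results is started and before the tests actually run, there are some interactive prompts to confirm the configuration. For most prompts, pressing Enter to accept the default is enough. In this test, three settings did not use the default: the amount of memory LMbench tests, the processor clock frequency, and whether to mail the results to the official LMbench3 address.
The CPU clock frequency can be read from: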
luyi@ubuntu:~$ cat /proc/cpuinfo | grep "model name"
model name : Pentium(R) Dual-Core CPU E5700 @ 3.00GHz
model name : Pentium(R) Dual-Core CPU E5700 @ 3.00GHz
Settings that did not use the default:
MB 1024 # the larger the memory under test, the longer the run takes
Checking to see if you have 1024 MB; please wait for a moment...
...
Processor mhz 3000
...
Mail results no
OK, no results mailed.
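After LMbench has run the tests selected by the configuration, it creates a subdirectory under results named after the system type, system name, and operating system type. Result files are stored there, named "hostname + sequence number". Run make see to view the result report and its explanation.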
root@ubuntu:/home/luyi/lmbench3/lmbench3/results# make see
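Put all the result files in lmbench3/lmbench3/results/x86_64-linux-gnu and run make see to get a clear side-by-side report. Below are the two sets of data, 3 runs each on the native system and in the virtualized environment: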
L M B E N C H  3 . 0   S U M M A R Y
------------------------------------
(Alpha software, do not distribute)
Basic system parameters
------------------------------------------------------------------------------
Host OS Description Mhz tlb cache mem scal
pages line par load
bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
baby-ubun Linux 3.16.0- x86_64-linux-gnu 2300 32 128 3.6500 1
baby-ubun Linux 3.16.0- x86_64-linux-gnu 2300 32 128 3.6000 1
baby-ubun Linux 3.16.0- x86_64-linux-gnu 2300 32 128 3.7100 1
virt-ubun Linux 3.16.0- x86_64-linux-gnu 2300 32 128 3.7800 1
virt-ubun Linux 3.16.0- x86_64-linux-gnu 2300 32 128 3.7800 1
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
baby-ubun Linux 3.16.0- 2300 0.05 0.12 0.34 1.12 3.04 0.12 0.78 91.6 261. 607.
baby-ubun Linux 3.16.0- 2300 0.05 0.12 0.35 1.11 2.93 0.12 0.78 97.8 266. 610.
baby-ubun Linux 3.16.0- 2300 0.05 0.12 0.34 1.11 2.91 0.12 0.77 95.8 261. 605.
virt-ubun Linux 3.16.0- 2300 0.05 0.12 0.35 1.20 2.97 0.12 0.88 99.0 289. 653.
virt-ubun Linux 3.16.0- 2300 0.05 0.12 0.35 1.11 2.93 0.12 0.86 98.4 278. 641.
virt-ubun Linux 3.16.0- 2300 0.05 0.12 0.35 1.14 2.92 0.12 0.90 105. 290. 660.
Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host OS intgr intgr intgr intgr intgr
bit add mul div mod
--------- ------------- ------ ------ ------ ------ ------
baby-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8100 8.7300
baby-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8300 8.7600
baby-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8100 8.7400
virt-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8200 8.7600
virt-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8300 8.7600
virt-ubun Linux 3.16.0- 0.3500 0.0500 1.0700 7.8500 8.7500
Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host OS float float float float
add mul div bogo
--------- ------------- ------ ------ ------ ------
baby-ubun Linux 3.16.0- 1.0400 1.7300 4.9600 5.0000
baby-ubun Linux 3.16.0- 1.0400 1.7300 4.9600 5.0000
baby-ubun Linux 3.16.0- 1.0400 1.7300 4.9500 5.0000
virt-ubun Linux 3.16.0- 1.0400 1.7300 4.9600 4.9700
virt-ubun Linux 3.16.0- 1.0400 1.7400 4.9700 4.9800
virt-ubun Linux 3.16.0- 1.0400 1.7300 4.9600 4.9900
Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host OS double double double double
add mul div bogo
--------- ------------- ------ ------ ------ ------
baby-ubun Linux 3.16.0- 1.0400 1.7300 7.7200 7.6100
baby-ubun Linux 3.16.0- 1.0400 1.7300 7.7200 7.6100
baby-ubun Linux 3.16.0- 1.0400 1.7300 7.7400 7.6100
virt-ubun Linux 3.16.0- 1.0400 1.7300 7.7400 7.6300
virt-ubun Linux 3.16.0- 1.0400 1.7300 7.7400 7.6300
virt-ubun Linux 3.16.0- 1.0400 1.7300 7.7300 7.6200
Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
baby-ubun Linux 3.16.0- 1.2700 1.1700 1.2900 1.6000 1.9200 1.75000 2.06000
baby-ubun Linux 3.16.0- 1.2500 1.2100 1.2700 1.5600 1.9300 1.73000 2.16000
baby-ubun Linux 3.16.0- 1.2800 1.2400 1.2400 1.5800 1.9800 1.72000 2.04000
virt-ubun Linux 3.16.0- 1.2600 1.2000 1.4500 1.6200 2.2200 1.83000 2.42000
virt-ubun Linux 3.16.0- 1.2000 1.2300 1.4800 1.7000 2.1900 1.81000 2.32000
virt-ubun Linux 3.16.0- 1.2900 1.2800 1.7200 1.7900 2.5500 2.06000 2.72000
*Local* Communication latencies in microseconds - smaller is better
---------------------------------------------------------------------
Host OS 2p/0K Pipe AF UDP RPC/ TCP RPC/ TCP
ctxsw UNIX UDP TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
baby-ubun Linux 3.16.0- 1.270 3.210 4.51 13.2 15.6 28.
baby-ubun Linux 3.16.0- 1.250 3.211 4.36 13.0 15.3 27.
baby-ubun Linux 3.16.0- 1.280 3.266 4.38 13.2 15.6 27.
virt-ubun Linux 3.16.0- 1.260 3.230 4.59 7.849 10.9 18.
virt-ubun Linux 3.16.0- 1.200 3.095 4.31 7.716 31.5 33.
virt-ubun Linux 3.16.0- 1.290 3.373 4.61 7.964 11.2 32.
File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page 100fd
Create Delete Create Delete Latency Fault Fault selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
baby-ubun Linux 3.16.0- 7.3678 5.6366 16.6 9.1331 6604.0 0.234 0.21320 1.222
baby-ubun Linux 3.16.0- 7.0883 5.6481 16.8 9.1536 6637.0 0.239 0.21540 1.221
baby-ubun Linux 3.16.0- 7.0470 5.7672 16.2 8.9707 6625.0 0.248 0.21420 1.223
virt-ubun Linux 3.16.0- 7.1741 5.8160 16.4 9.0638 7085.0 0.297 0.23500 1.224
virt-ubun Linux 3.16.0- 7.0921 5.7670 16.4 9.0966 7162.0 0.296 0.23360 1.225
virt-ubun Linux 3.16.0- 7.1873 5.8700 16.9 9.2117 7485.0 0.258 0.28000 1.227
*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------------------------
Host OS Pipe AF TCP File Mmap Bcopy Bcopy Mem Mem
UNIX reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
baby-ubun Linux 3.16.0- 5196 5689 4410 5486.4 8603.2 4637.3 3200.9 7920 4828.
baby-ubun Linux 3.16.0- 5205 5639 4350 5463.1 8593.6 4634.6 3201.2 7917 4827.
baby-ubun Linux 3.16.0- 5199 5648 4473 5472.4 8599.0 4636.9 3201.2 7920 4828.
virt-ubun Linux 3.16.0- 4810 5518 3499 6370.3 11.0K 3176.1 3144.7 7828 4745.
virt-ubun Linux 3.16.0- 4909 3711 6013.2 10.2K 3179.5 3145.5 7795 4739.
virt-ubun Linux 3.16.0- 4587 5221 3606 5524.6 8882.7 3153.5 3116.3 7750 4688.
Memory latencies in nanoseconds - smaller is better
(WARNING - may not be correct, check graphs)
------------------------------------------------------------------------------
Host OS Mhz L1 $ L2 $ Main mem Rand mem Guesses
--------- ------------- --- ---- ---- -------- -------- -------
baby-ubun Linux 3.16.0- 2300 1.3830 4.1490 20.9 76.8
baby-ubun Linux 3.16.0- 2300 1.3830 4.1500 21.4 78.7
baby-ubun Linux 3.16.0- 2300 1.3830 4.1490 21.4 77.5
virt-ubun Linux 3.16.0- 2300 1.3850 5.4010 21.4 122.5
virt-ubun Linux 3.16.0- 2300 1.3860 4.3540 22.0 125.5
virt-ubun Linux 3.16.0- 2300 1.3850 4.1840 22.4 125.8
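The results above show that memory bandwidth and latency under KVM virtualization are fairly close to the native system. A rough conclusion: with hardware memory virtualization support (such as Intel EPT), QEMU/KVM memory virtualization performs well and reaches more than 95% of native performance.
Disk I/O performance test
IOzone is used for this test. It measures file system performance through a variety of file operations (such as sequential read and write, re-read, re-write, and random read and write).
Download the IOzone source, extract it, enter iozone3_414/src/current, and run make linux-AMD64 to build. When the build finishes, the iozone executable is in the current directory.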
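To put numbers on "fairly close", a few rows of the LMbench summary can be turned into guest-to-host ratios. This Python sketch is an illustration only: the choice of rows, the use of the mean over the three runs, and the name `relative_performance` are assumptions, not something the source specifies. Bandwidth rows are "bigger is better" and latency rows are "smaller is better", so the ratio is inverted for latencies.

```python
from statistics import mean

# (host runs, guest runs, higher_is_better), values copied from the
# "Communication bandwidths" and "Memory latencies" tables.
METRICS = {
    "Mem read (MB/s)":       ([7920, 7917, 7920], [7828, 7795, 7750], True),
    "Mem write (MB/s)":      ([4828, 4827, 4828], [4745, 4739, 4688], True),
    "Main mem latency (ns)": ([20.9, 21.4, 21.4], [21.4, 22.0, 22.4], False),
    "Rand mem latency (ns)": ([76.8, 78.7, 77.5], [122.5, 125.5, 125.8], False),
}


def relative_performance(host, guest, higher_is_better):
    """Guest performance as a fraction of host, based on the mean of the runs."""
    if higher_is_better:
        return mean(guest) / mean(host)
    # For latencies a smaller value is better, so invert the ratio.
    return mean(host) / mean(guest)


if __name__ == "__main__":
    for name, (host, guest, higher) in METRICS.items():
        print(f"{name}: {relative_performance(host, guest, higher):.1%}")
```

On this data the memory read, memory write, and main memory latency rows come out above 95%, while the random memory latency row deviates noticeably more, so the headline figure is best read as describing the bandwidth and main memory latency rows.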
root@ubuntu:/home/luyi/iozone3_414/iozone3_414/src/current# ./iozone -s 512m -r 8k -S 2048 -L 64 -I -i 0 -i 1 -i 2 -Rab iozone.xls
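In the command above, -s 512m sets the test file size to 512 MB; -r 8k sets the record size (the size of one read or write operation) to 8 KB; -S 2048 states that the machine's cache size is 2048 KB; -L 64 states that the cache line size is 64 bytes; -I uses direct I/O, bypassing the page cache; -i 0 -i 1 -i 2 runs the three tests "0=write/rewrite, 1=read/re-read, 2=random-read/write"; -Rab iozone.xls runs the full automatic mode and writes an Excel-format report to iozone.xls. The values for -S and -L come from the following command; IOzone can also determine both values itself: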
root@ubuntu:/home/luyi/iozone3_414/iozone3_414/src/current# cat /proc/cpuinfo | grep cache
cache size : 2048 KB
cache_alignment : 64
cache size : 2048 KB
cache_alignment : 64
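The lookup above can also be scripted so that the -S and -L values are derived rather than typed by hand. The following Python sketch assumes, as the text does, that cache size maps to -S and cache_alignment maps to -L; the function names `cache_params` and `iozone_cache_flags` are illustrative, and the sample text mirrors the grep output shown above.

```python
import re


def cache_params(cpuinfo_text: str):
    """Return (cache size in KB, cache line size in bytes) from cpuinfo text."""
    size = re.search(r"^cache size\s*:\s*(\d+)\s*KB", cpuinfo_text, re.MULTILINE)
    line = re.search(r"^cache_alignment\s*:\s*(\d+)", cpuinfo_text, re.MULTILINE)
    if size is None or line is None:
        raise ValueError("cache size or cache_alignment not found")
    return int(size.group(1)), int(line.group(1))


def iozone_cache_flags(cpuinfo_text: str):
    """Build the -S/-L arguments used in the iozone command line."""
    size_kb, line_bytes = cache_params(cpuinfo_text)
    return ["-S", str(size_kb), "-L", str(line_bytes)]


if __name__ == "__main__":
    # Sample mirrors the grep output; on a live system pass the contents
    # of /proc/cpuinfo instead (hypothetical usage).
    sample = "cache size : 2048 KB\ncache_alignment : 64\n"
    print(iozone_cache_flags(sample))
```

Only the first matching entry is used, which is sufficient here because both cores report the same values.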
Record size 1k (size of one read/write operation), throughput in MB/s
                                                  Write     Re-write  Read      Re-read   Random read  Random write
host (physical machine)                           1.67      9.31      14.02     13.89     0.17         0.26
virt-none (guest, cache=none)                     1.47      6.71      7.37      7.65      0.17         0.25
virt-default (guest, cache=default (writeback))   17.56     16.15     17.87     18.62     21.17        2.52
virt-writethrough (guest, cache=writethrough)     0.11      0.11      21.84     21.83     21.65        0.12
Record size 8k (size of one read/write operation), throughput in MB/s
                                                  Write     Re-write  Read      Re-read   Random read  Random write
host                                              63.15     67.02     71.75     70.45     1.10         1.83
virt-none                                         41.02     40.75     43.78     45.30     1.01         1.82
virt-default                                      125.04    146.71    161.26    161.11    160.65       16.01
virt-writethrough                                 0.91      0.91      164.16    164.19    163.01       0.85
Record size 1m, throughput in MB/s
                                                  Write     Re-write  Read      Re-read   Random read  Random write
host                                              98.03     98.58     101.67    101.67    57.26        61.18
virt-none                                         95.73     98.34     100.54    100.22    56.71        65.30
virt-default                                      168.07    173.64    2609.87   2676.79   2777.94      141.64
virt-writethrough                                 52.83     52.52     3283.17   3317.91   3221.16      40.71
Record size 8m, throughput in MB/s
                                                  Write     Re-write  Read      Re-read   Random read  Random write
host                                              100.30    100.80    102.05    101.88    92.71        82.89
virt-none                                         81.90     86.50     97.41     98.81     90.49        80.74
virt-default                                      210.12    171.43    2691.20   2700.01   2682.85      185.32
virt-writethrough                                 62.02     64.35     2546.63   2624.13   2663.87      59.44
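The data above were obtained by varying the cache mode of the virtual disk and the record size used in the test.
With cache_mode set to none, the virtual disk bypasses the page cache. (The page cache greatly speeds up access to the virtual disk, so performance with cache=writeback is very good, but an unexpected power loss may cause data loss.) To see the performance overhead of the virtual disk itself, compare the host and virt-none rows. I/O performance depends on the record size: with large records (1m or 8m), the virtual disk comes close to the physical disk; with small records (1k or 8k), the virtual disk reaches roughly 60%-70% of the physical disk.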
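The host versus virt-none comparison can be worked out directly from the tables. This is a small Python sketch under stated assumptions: the values are copied from the host and virt-none rows, the helper names are illustrative, and averaging the four sequential columns is just one reasonable way to summarize a record size.

```python
from statistics import mean

# Column order: write, re-write, read, re-read, random read, random write.
# Values copied from the host and virt-none rows of the IOzone tables.
HOST = {
    "1k": [1.67, 9.31, 14.02, 13.89, 0.17, 0.26],
    "8k": [63.15, 67.02, 71.75, 70.45, 1.10, 1.83],
    "1m": [98.03, 98.58, 101.67, 101.67, 57.26, 61.18],
    "8m": [100.30, 100.80, 102.05, 101.88, 92.71, 82.89],
}
VIRT_NONE = {
    "1k": [1.47, 6.71, 7.37, 7.65, 0.17, 0.25],
    "8k": [41.02, 40.75, 43.78, 45.30, 1.01, 1.82],
    "1m": [95.73, 98.34, 100.54, 100.22, 56.71, 65.30],
    "8m": [81.90, 86.50, 97.41, 98.81, 90.49, 80.74],
}


def throughput_ratios(host_row, guest_row):
    """Per-column guest/host throughput ratio (higher throughput is better)."""
    return [guest / host for host, guest in zip(host_row, guest_row)]


def sequential_ratio(record_size):
    """Mean guest/host ratio over the four sequential columns."""
    return mean(throughput_ratios(HOST[record_size], VIRT_NONE[record_size])[:4])


if __name__ == "__main__":
    for size in HOST:
        print(f"{size}: {sequential_ratio(size):.0%} of host (sequential)")
```

On this data the 8k sequential columns sit in the low 60% range and the 1m columns are within a few percent of the host, which is the pattern the paragraph describes.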