KVM virtual machine running out of memory: tuning parameters
Dec 20 21:23:45 vgfs001 kernel: tiotest_AMD_x86 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
Dec 20 21:23:45 vgfs001 kernel: tiotest_AMD_x86 cpuset=/ mems_allowed=0
Dec 20 21:23:45 vgfs001 kernel: Pid: 1937, comm: tiotest_AMD_x86 Not tainted 2.6.32-431.29.2.lustre.el6.x86_64 #1
Dec 20 21:23:45 vgfs001 kernel: Call Trace:
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff810d07b1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81122b80>] ? dump_header+0x90/0x1b0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff8122894c>] ? security_real_capable_noaudit+0x3c/0x70
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81123002>] ? oom_kill_process+0x82/0x2a0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81122f41>] ? select_bad_process+0xe1/0x120
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81123440>] ? out_of_memory+0x220/0x3c0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff8112fd5f>] ? __alloc_pages_nodemask+0x89f/0x8d0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81167cea>] ? alloc_pages_current+0xaa/0x110
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff8111ff77>] ? __page_cache_alloc+0x87/0x90
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81120c8e>] ? grab_cache_page_write_begin+0x8e/0xc0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa0a8f228>] ? ll_write_begin+0x58/0x1a0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff811204f3>] ? generic_file_buffered_write+0x123/0x2e0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81078fd7>] ? current_fs_time+0x27/0x30
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81121f50>] ? __generic_file_aio_write+0x260/0x490
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa05211a5>] ? cl_env_info+0x15/0x20
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81122208>] ? generic_file_aio_write+0x88/0x100
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa0aa3907>] ? vvp_io_write_start+0x137/0x2a0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa05301da>] ? cl_io_start+0x6a/0x140
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa05348e4>] ? cl_io_loop+0xb4/0x1b0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa0a46306>] ? ll_file_io_generic+0x2a6/0x610
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa0a47192>] ? ll_file_aio_write+0x142/0x2c0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffffa0a4747c>] ? ll_file_write+0x16c/0x2a0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81189298>] ? vfs_write+0xb8/0x1a0
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff81189c61>] ? sys_write+0x51/0x90
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff810e204e>] ? __audit_syscall_exit+0x25e/0x290
Dec 20 21:23:45 vgfs001 kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Dec 20 21:23:45 vgfs001 kernel: Mem-Info:
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA per-cpu:
Dec 20 21:23:45 vgfs001 kernel: CPU 0: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 1: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 2: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 3: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 4: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 5: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 6: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 7: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 8: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 9: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 10: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 11: hi: 0, btch: 1 usd: 0
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA32 per-cpu:
Dec 20 21:23:45 vgfs001 kernel: CPU 0: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 1: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 2: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 3: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 4: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 5: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 6: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 7: hi:186, btch:31 usd:11
Dec 20 21:23:45 vgfs001 kernel: CPU 8: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 9: hi:186, btch:31 usd:46
Dec 20 21:23:45 vgfs001 kernel: CPU 10: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 11: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: Node 0 Normal per-cpu:
Dec 20 21:23:45 vgfs001 kernel: CPU 0: hi:186, btch:31 usd: 2
Dec 20 21:23:45 vgfs001 kernel: CPU 1: hi:186, btch:31 usd: 7
Dec 20 21:23:45 vgfs001 kernel: CPU 2: hi:186, btch:31 usd:27
Dec 20 21:23:45 vgfs001 kernel: CPU 3: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 4: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 5: hi:186, btch:31 usd:39
Dec 20 21:23:45 vgfs001 kernel: CPU 6: hi:186, btch:31 usd:33
Dec 20 21:23:45 vgfs001 kernel: CPU 7: hi:186, btch:31 usd: 0
Dec 20 21:23:45 vgfs001 kernel: CPU 8: hi:186, btch:31 usd: 1
Dec 20 21:23:45 vgfs001 kernel: CPU 9: hi:186, btch:31 usd:35
Dec 20 21:23:45 vgfs001 kernel: CPU 10: hi:186, btch:31 usd:29
Dec 20 21:23:45 vgfs001 kernel: CPU 11: hi:186, btch:31 usd: 2
Dec 20 21:23:45 vgfs001 kernel: active_anon:1198006 inactive_anon:171400 isolated_anon:96
Dec 20 21:23:45 vgfs001 kernel: active_file:548228 inactive_file:548497 isolated_file:0
Dec 20 21:23:45 vgfs001 kernel: unevictable:0 dirty:899 writeback:2342 unstable:0
Dec 20 21:23:45 vgfs001 kernel: free:29297 slab_reclaimable:10639 slab_unreclaimable:376601
Dec 20 21:23:45 vgfs001 kernel: mapped:1032 shmem:0 pagetables:5613 bounce:0
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA free:15708kB min:80kB low:100kB high:120kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15320kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Dec 20 21:23:45 vgfs001 kernel: lowmem_reserve[]: 0 3512 12097 12097
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA32 free:53892kB min:19596kB low:24492kB high:29392kB active_anon:4kB inactive_anon:44kB active_file:1249260kB inactive_file:1249288kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3596496kB mlocked:0kB dirty:3436kB writeback:4180kB mapped:0kB shmem:0kB slab_reclaimable:24608kB slab_unreclaimable:689432kB kernel_stack:8kB pagetables:196kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:4212142 all_unreclaimable? no
Dec 20 21:23:45 vgfs001 kernel: lowmem_reserve[]: 0 0 8585 8585
Dec 20 21:23:45 vgfs001 kernel: Node 0 Normal free:47588kB min:47900kB low:59872kB high:71848kB active_anon:4792020kB inactive_anon:685556kB active_file:943652kB inactive_file:944700kB unevictable:0kB isolated(anon):384kB isolated(file):0kB present:8791040kB mlocked:0kB dirty:160kB writeback:5188kB mapped:4128kB shmem:0kB slab_reclaimable:17948kB slab_unreclaimable:816972kB kernel_stack:5040kB pagetables:22256kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2346101 all_unreclaimable? no
Dec 20 21:23:45 vgfs001 kernel: lowmem_reserve[]: 0 0 0 0
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA: 3*4kB 2*8kB 2*16kB 1*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15708kB
Dec 20 21:23:45 vgfs001 kernel: Node 0 DMA32: 183*4kB 19*8kB 19*16kB 19*32kB 24*64kB 17*128kB 7*256kB 5*512kB 27*1024kB 8*2048kB 0*4096kB = 53892kB
Dec 20 21:23:45 vgfs001 kernel: Node 0 Normal: 109*4kB 185*8kB 121*16kB 43*32kB 8*64kB 117*128kB 43*256kB 8*512kB 1*1024kB 1*2048kB 2*4096kB = 47084kB
Dec 20 21:23:45 vgfs001 kernel: 1269461 total pagecache pages
Dec 20 21:23:45 vgfs001 kernel: 172616 pages in swap cache
Dec 20 21:23:45 vgfs001 kernel: Swap cache stats: add 1017139, delete 844523, find 444300/457367
Dec 20 21:23:45 vgfs001 kernel: Free swap= 3377416kB
Dec 20 21:23:45 vgfs001 kernel: Total swap = 4194300kB
Dec 20 21:23:45 vgfs001 kernel: 3145727 pages RAM
Dec 20 21:23:45 vgfs001 kernel: 96633 pages reserved
Dec 20 21:23:45 vgfs001 kernel: 9844603 pages shared
Dec 20 21:23:45 vgfs001 kernel: 528776 pages non-shared
Dec 20 21:23:45 vgfs001 kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Dec 20 21:23:45 vgfs001 kernel: [ 591] 0 591 2817 4 9 -17 -1000 udevd
Dec 20 21:23:45 vgfs001 kernel: [ 2028] 0 2028 6899 30 0 -17 -1000 auditd
Dec 20 21:23:45 vgfs001 kernel: [ 2058] 0 2058 63875 54 2 0 0 rsyslogd
Dec 20 21:23:45 vgfs001 kernel: [ 2088] 0 2088 2740 38 7 0 0 irqbalance
Dec 20 21:23:45 vgfs001 kernel: [ 2110] 32 2110 4744 22 1 0 0 rpcbind
Dec 20 21:23:45 vgfs001 kernel: [ 2229] 81 2229 8028 9 3 0 0 dbus-daemon
Dec 20 21:23:45 vgfs001 kernel: [ 2251] 29 2251 5837 10 2 0 0 rpc.statd
Dec 20 21:23:45 vgfs001 kernel: [ 2281] 0 2281 47351 11 7 0 0 cupsd
Dec 20 21:23:45 vgfs001 kernel: [ 2317] 0 2317 1020 8 0 0 0 acpid
Dec 20 21:23:45 vgfs001 kernel: [ 2327] 68 2327 9771 123 9 0 0 hald
Dec 20 21:23:45 vgfs001 kernel: [ 2328] 0 2328 5100 910 0 0 hald-runner
Dec 20 21:23:45 vgfs001 kernel: [ 2370] 0 2370 5630 8 7 0 0 hald-addon-inpu
Dec 20 21:23:45 vgfs001 kernel: [ 2376] 68 2376 4502 9 0 0 0 hald-addon-acpi
Dec 20 21:23:45 vgfs001 kernel: [ 2396] 0 2396 96535 4211 0 0 automount
Dec 20 21:23:45 vgfs001 kernel: [ 2425] 0 2425 16671 8 4 -17 -1000 sshd
Dec 20 21:23:45 vgfs001 kernel: [ 2534] 0 2534 20331 28 4 0 0 master
Dec 20 21:23:45 vgfs001 kernel: [ 2549] 89 2549 20397 2910 0 0 qmgr
Dec 20 21:23:45 vgfs001 kernel: [ 2562] 0 2562 28661 7 1 0 0 abrtd
Dec 20 21:23:45 vgfs001 kernel: [ 2577] 0 2577 27116 77 6 0 0 ksmtuned
Dec 20 21:23:45 vgfs001 kernel: [ 2589] 0 2589 29332 21 6 0 0 crond
Dec 20 21:23:45 vgfs001 kernel: [ 2638] 0 2638 5394 5 4 0 0 atd
Dec 20 21:23:45 vgfs001 kernel: [ 2649] 0 2649 104692 1712 3 0 0 python
Dec 20 21:23:45 vgfs001 kernel: [ 2666] 0 2666 257137 979 3 0 0 libvirtd
Dec 20 21:23:45 vgfs001 kernel: [ 2695] 0 2695 27085 6 5 0 0 rhsmcertd
Dec 20 21:23:45 vgfs001 kernel: [ 2796] 99 2796 3223 9 7 0 0 dnsmasq
Dec 20 21:23:45 vgfs001 kernel: [ 2802] 0 2802 16175 7 1 0 0 certmonger
Dec 20 21:23:45 vgfs001 kernel: [ 2824] 0 2824 33502 11 1 0 0 gdm-binary
Dec 20 21:23:45 vgfs001 kernel: [ 2840] 0 2840 1016 6 3 0 0 mingetty
Dec 20 21:23:45 vgfs001 kernel: [ 2842] 0 2842 1016 6 7 0 0 mingetty
Dec 20 21:23:45 vgfs001 kernel: [ 2844] 0 2844 1016 6 4 0 0 mingetty
Dec 20 21:23:45 vgfs001 kernel: [ 2846] 0 2846 1016 6 4 0 0 mingetty
Dec 20 21:23:45 vgfs001 kernel: [ 2850] 0 2850 1016 6 4 0 0 mingetty
Dec 20 21:23:45 vgfs001 kernel: [ 2862] 0 2862 3212 4 9 -17 -1000 udevd
Dec 20 21:23:45 vgfs001 kernel: [ 2863] 0 2863 3212 4 9 -17 -1000 udevd
Dec 20 21:23:45 vgfs001 kernel: [ 2911] 0 2911 41157 11 6 0 0 gdm-simple-slav
Dec 20 21:23:45 vgfs001 kernel: [ 2929] 0 2929 35211 911 2 0 0 Xorg
Dec 20 21:23:45 vgfs001 kernel: [ 2970] 0 2970 1029163 10 1 0 0 console-kit-dae
Dec 20 21:23:45 vgfs001 kernel: [ 3040] 42 3040 5010 5 9 0 0 dbus-launch
Dec 20 21:23:45 vgfs001 kernel: [ 3041] 42 3041 7951 10 0 0 0 dbus-daemon
Dec 20 21:23:45 vgfs001 kernel: [ 3043] 42 3043 67404 11 8 0 0 gnome-session
Dec 20 21:23:45 vgfs001 kernel: [ 3046] 0 3046 12497 11 3 0 0 devkit-power-da
Dec 20 21:23:45 vgfs001 kernel: [ 3052] 42 3052 33326 64 0 0 0 gconfd-2
Dec 20 21:23:45 vgfs001 kernel: [ 3069] 42 3069 91526 3293 8 0 0 gnome-settings-
Dec 20 21:23:45 vgfs001 kernel: [ 3070] 42 3070 30178 56 0 0 0 at-spi-registry
Dec 20 21:23:45 vgfs001 kernel: [ 3072] 42 3072 89614 11 6 0 0 bonobo-activati
Dec 20 21:23:45 vgfs001 kernel: [ 3080] 42 3080 33821 11 8 0 0 gvfsd
Dec 20 21:23:45 vgfs001 kernel: [ 3081] 42 3081 72400 92 0 0 0 metacity
Dec 20 21:23:45 vgfs001 kernel: [ 3084] 42 3084 68544 64 2 0 0 gnome-power-man
Dec 20 21:23:45 vgfs001 kernel: [ 3085] 42 3085 62195 10 6 0 0 polkit-gnome-au
Dec 20 21:23:45 vgfs001 kernel: [ 3087] 42 3087 96302 288 0 0 0 gdm-simple-gree
Dec 20 21:23:45 vgfs001 kernel: [ 3094] 0 3094 13186 10 9 0 0 polkitd
Dec 20 21:23:45 vgfs001 kernel: [ 3107] 42 3107 86550 9 5 0 0 pulseaudio
Dec 20 21:23:45 vgfs001 kernel: [ 3109] 499 3109 42114 2510 0 0 rtkit-daemon
Dec 20 21:23:45 vgfs001 kernel: [ 3114] 0 3114 35562 11 6 0 0 gdm-session-wor
Dec 20 21:23:45 vgfs001 kernel: [27425] 0 27425 25109 40 3 0 0 sshd
Dec 20 21:23:45 vgfs001 kernel: [27430] 0 27430 27123 80 6 0 0 bash
Dec 20 21:23:45 vgfs001 kernel: [ 1567] 0 1567 1711609 1190642 1 0 0 lwfsd
Dec 20 21:23:45 vgfs001 kernel: [ 1691] 89 1691 20351 20 5 0 0 pickup
Dec 20 21:23:45 vgfs001 kernel: [ 1926] 0 1926 25227 25 8 0 0 sleep
Dec 20 21:23:45 vgfs001 kernel: [ 1927] 0 1927 46749 4269 7 0 0 tiotest_AMD_x86
Dec 20 21:23:45 vgfs001 kernel: Out of memory: Kill process 1567 (lwfsd) score 306 or sacrifice child
Dec 20 21:23:45 vgfs001 kernel: Killed process 1567, UID 0, (lwfsd) total-vm:6846436kB, anon-rss:4742528kB, file-rss:20040kB
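Here the OOM killer was invoked from the Lustre write path (ll_file_write in the call trace above). In practice other entry points, such as the KVM hypervisor, could trigger it just as well: any point that allocates memory can end up invoking the OOM killer.
Judging from the analysis, the Lustre cache was indeed holding a large amount of memory, which left too little for new allocations.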
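To put the Mem-Info counters from the log in perspective, the sketch below converts a few of the page counts to MiB. It assumes 4 KiB pages (the usual x86_64 base page size), and `pages_to_mib` is a helper name made up for this example.

```shell
#!/bin/sh
# Convert a page count from the Mem-Info dump to MiB.
# Assumption: 4 KiB pages, so pages * 4 gives KiB.
pages_to_mib() {
    echo $(( $1 * 4 / 1024 ))
}

# Figures copied from the OOM log above.
echo "RAM:                $(pages_to_mib 3145727) MiB"
echo "anonymous memory:   $(pages_to_mib $((1198006 + 171400))) MiB"
echo "page cache:         $(pages_to_mib 1269461) MiB"
echo "unreclaimable slab: $(pages_to_mib 376601) MiB"
echo "free:               $(pages_to_mib 29297) MiB"
```

On this dump the page cache plus unreclaimable slab is roughly as large as the anonymous memory, and most of the anonymous memory belongs to lwfsd.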
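Three measures.
1. Increase the memory
From 12GB to 16GB.
virsh setmaxmem vgfsxxx 16GB --config
Once the VM has been started again:
virsh setmem vgfsxxx 16GB
(virsh reads the GB suffix as decimal gigabytes, i.e. 10^9 bytes; write 16G or 16GiB to get a full 16 GiB.)
This did not help. After a few more test runs the service still went down.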
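One way to avoid any ambiguity about the unit suffix is to pass the size in KiB, which is the unit virsh assumes when no suffix is given. The sketch below only prints the commands (a dry run) so it can be checked before anything is changed; `gib_to_kib` is a hypothetical helper, and `vgfsxxx` is the placeholder domain name used in this post.

```shell
#!/bin/sh
# Convert GiB to KiB, the default unit for virsh memory sizes.
gib_to_kib() {
    echo $(( $1 * 1024 * 1024 ))
}

DOMAIN=vgfsxxx   # placeholder domain name, as in the post

# Dry run: print the commands instead of executing them.
echo "virsh setmaxmem $DOMAIN $(gib_to_kib 16) --config"
echo "virsh setmem $DOMAIN $(gib_to_kib 16)"
# 'virsh dominfo' reports "Max memory" and "Used memory" for a domain,
# which is a quick check that the new size was applied.
echo "virsh dominfo $DOMAIN"
```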
2. Adjust the OOM priority of lwfsd
Set the oom_adj of lwfsd to -17, which exempts the process from the OOM killer.
PID=$(pgrep -x lwfsd)
echo -17 > /proc/$PID/oom_adj
echo -17 > /proc/$PID/task/$PID/oom_adj
This seems to help.
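The setting belongs to the running process, so it is lost when lwfsd is restarted, and it is worth checking that it is still in place before each test run. A minimal sketch of such a check follows; `show_oom_adj` and its optional second argument (an alternative proc root, used only for testing) are names invented for this example.

```shell
#!/bin/sh
# Print "<pid> <oom_adj>" for one process.
# $1 = PID, $2 = proc root (defaults to /proc; overridable for testing).
show_oom_adj() {
    pid=$1
    root=${2:-/proc}
    f="$root/$pid/oom_adj"
    if [ -r "$f" ]; then
        echo "$pid $(cat "$f")"
    else
        echo "$pid unavailable"
        return 1
    fi
}

# Usage on the guest (assumes the daemon's command name is exactly "lwfsd"):
#   for p in $(pgrep -x lwfsd); do show_oom_adj "$p"; done
```

A value of -17 in the output means the process is still exempt from the OOM killer.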
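3. Change the memory overcommit policy
In addition, run echo "2" > /proc/sys/vm/overcommit_memory. In this mode the kernel refuses any allocation that would push the total committed memory past swap plus overcommit_ratio percent of RAM, so a request only succeeds when there is enough space to back the mapping.
This also seems to help to some extent. More test runs are needed.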
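As a rough illustration of what mode 2 implies for this guest, the sketch below estimates the commit limit from the figures in the OOM log. It assumes 4 KiB pages and the default overcommit_ratio of 50, and it ignores reserved and huge pages, so the kernel's own number will differ somewhat; `commit_limit_kb` is a helper name made up here.

```shell
#!/bin/sh
# Approximate commit limit in kB under vm.overcommit_memory=2:
#   swap + RAM * overcommit_ratio / 100
# $1 = RAM in kB, $2 = swap in kB, $3 = overcommit_ratio in percent
commit_limit_kb() {
    echo $(( $2 + $1 * $3 / 100 ))
}

RAM_KB=$(( 3145727 * 4 ))   # "3145727 pages RAM" from the log, 4 KiB pages assumed
SWAP_KB=4194300             # "Total swap" from the log

echo "approx. CommitLimit: $(commit_limit_kb "$RAM_KB" "$SWAP_KB" 50) kB"

# On the guest itself the real values can be read with:
#   grep -E 'CommitLimit|Committed_AS' /proc/meminfo
```

For comparison, lwfsd alone had a total-vm of 6846436kB when it was killed. The echo into /proc does not survive a reboot; adding vm.overcommit_memory = 2 to /etc/sysctl.conf is the usual way to make it persistent.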