1. About CPU load, as reported by esxtop
If the CPU load average is >= 1, the host is overloaded.
A PCPU used% of around 80% is healthy; at 90% or above the host is approaching overload.
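A minimal sketch of these rules of thumb, assuming the load average and PCPU used% values have already been read out of esxtop; the function and argument names below are illustrative, not an esxtop API:

```python
# Rough health check based on the rules of thumb above.
def assess_host_cpu(load_average: float, pcpu_used_pct: float) -> str:
    if load_average >= 1.0:
        return "overloaded: CPU demand exceeds the host's capacity"
    if pcpu_used_pct >= 90.0:
        return "warning: PCPU used% at 90% or more, approaching overload"
    return "ok: PCPU used% around 80% or below is considered healthy"

print(assess_host_cpu(0.45, 72.0))  # -> ok ...
print(assess_host_cpu(0.95, 93.0))  # -> warning ...
```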
Giving a VM more vCPUs than it needs can consume extra resources, especially on heavily loaded hosts.
For example, a single-threaded workload running in a multi-vCPU VM, or a multi-threaded workload running in a VM with more vCPUs than that workload needs.
Most guest operating systems execute an idle loop during periods of inactivity. Within this loop,
most of these guest operating systems halt by executing the HLT or MWAIT instructions. Some older
guest operating systems (including Windows 2000 (with certain HALs), Solaris 8 and 9, and
MS-DOS), however, use busy-waiting within their idle loops. This results in the consumption of
resources that might otherwise be available for other uses (other virtual machines, the VMkernel, and
so on).
ESXi automatically detects these loops and de-schedules the idle vCPU. Though this reduces the CPU
overhead, it can also reduce the performance of some I/O-heavy workloads. For additional
information see VMware KB articles 1077 and 2231.
The guest operating system’s scheduler might migrate a single-threaded workload amongst multiple
vCPUs, thereby losing cache locality.
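One guest-side mitigation is to pin the single-threaded process to one vCPU so the guest scheduler cannot migrate it. A minimal sketch for a Linux guest, assuming Python 3.3+ (os.sched_setaffinity is Linux-only):

```python
# Pin the current, single-threaded process to one vCPU inside a Linux
# guest so the guest scheduler does not migrate it across vCPUs and
# discard its cache locality.
import os

def pin_to_vcpu(cpu_index: int = 0) -> None:
    os.sched_setaffinity(0, {cpu_index})  # 0 = the current process
    print("now restricted to vCPU(s):", sorted(os.sched_getaffinity(0)))

if __name__ == "__main__":
    pin_to_vcpu(0)
    # ... run the single-threaded workload from here ...
```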
2. Enabling hyperthreading is recommended
cpu0 and cpu1 are the logical processors of the first core, cpu2 and cpu3 of the second core, and so on.
Hyperthreading and CPU affinity: binding vCPUs to two logical processors on the same physical core can cause performance problems.
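Under that numbering (two logical processors per core), a quick sketch for checking that a manual affinity choice does not land two vCPUs on sibling logical processors of the same core; the two-per-core mapping is assumed here rather than queried from the host:

```python
# Map ESXi logical CPU numbers to physical cores under the scheme above
# (cpu0/cpu1 -> core 0, cpu2/cpu3 -> core 1, ...). Assumes exactly two
# logical processors per core, i.e. hyperthreading is enabled.

def core_of(logical_cpu: int) -> int:
    return logical_cpu // 2

def shares_a_core(logical_cpus: list[int]) -> bool:
    """True if any two of the chosen logical CPUs sit on the same core."""
    cores = [core_of(c) for c in logical_cpus]
    return len(cores) != len(set(cores))

print(shares_a_core([0, 1]))  # True  -> siblings on core 0; avoid for affinity
print(shares_a_core([0, 2]))  # False -> separate physical cores
```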
3. NUMA
ESXi manages this intelligently through its NUMA scheduler and memory placement policies. The automatic settings can be overridden manually, but it is usually best to leave them alone and instead use memory affinity to control memory placement, or CPU resource settings to control CPU resources.
By default, ESXi NUMA scheduling and related optimizations are enabled only on systems with a total of at least four CPU cores and with at least two CPU cores per NUMA node.
There are two cases:
A. The VM has no more vCPUs than the number of cores in a NUMA node: all of its vCPUs are placed on one node and it uses local memory.
B. The VM has more vCPUs than the number of cores in a NUMA node: this is called a wide VM. Its vCPUs are spread across two or more nodes, so some accesses go to remote memory and incur extra latency. vNUMA can mitigate this by exposing the NUMA topology to the guest, letting the guest OS take part in keeping memory accesses local.
Case A therefore generally performs better than case B. Conversely, for workloads bottlenecked on memory bandwidth, B can exploit the combined bandwidth of several nodes. For a VM that would otherwise fit on a single node (case A), numa.vcpu.maxPerMachineNode can be changed to spread its vCPUs across more nodes and obtain that bandwidth benefit.
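As a rough aid to the two cases above, a sketch that classifies a VM as case A or B and, for a bandwidth-bound case-A VM, suggests a numa.vcpu.maxPerMachineNode value that would spread its vCPUs over a chosen number of nodes; the ceil-division heuristic is an illustrative assumption, not an official VMware formula:

```python
import math

def numa_plan(vcpus: int, cores_per_node: int, target_nodes: int = 2) -> dict:
    """Classify a VM against the two cases above and, for a case-A VM
    that is memory-bandwidth bound, suggest a numa.vcpu.maxPerMachineNode
    value that forces its vCPUs across `target_nodes` NUMA nodes."""
    wide = vcpus > cores_per_node                  # case B: wide VM
    plan = {"case": "B (wide VM)" if wide else "A (fits one node)"}
    if not wide and target_nodes > 1:
        # Capping vCPUs per node makes the scheduler use more nodes,
        # trading memory locality for aggregate memory bandwidth.
        plan["numa.vcpu.maxPerMachineNode"] = math.ceil(vcpus / target_nodes)
    return plan

print(numa_plan(vcpus=8, cores_per_node=12))   # case A -> suggests a cap of 4
print(numa_plan(vcpus=16, cores_per_node=12))  # case B -> already a wide VM
```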