My VMTurbo dashboard tells me my VDI cluster is beginning to suffer from Critical Memory Congestion and is now recommending that I provision two more hosts. The message first appeared on Saturday, which is a relatively low-utilization period for us.
Here is our environment layout:
We have seven hosts consisting of two types.
Hosts 1 - 4 = 2 socket x 4 core, Intel E5-2643 (Sandy Bridge) @ 3.30 GHz with 128 GB RAM
Hosts 5 - 7 = 2 socket x 10 core, Intel E5-2680 v2 (Ivy Bridge) @ 2.80 GHz with 384 GB RAM
Our VDI VMs are each configured with 2 virtual sockets x 1 core (2 vCPUs) and 2.5 GB RAM. We have roughly 327 running VMs, and peak at about 240 connected users on average.
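For reference, here is a rough back-of-the-envelope tally of the layout above. It is only a sketch: it counts configured VM memory against raw physical RAM and assumes no hypervisor overhead, no reservations, no page sharing, and no hyperthreading. The variable names are mine, not anything VMTurbo reports.

```python
# (host count, physical cores per host, RAM in GB per host), taken from the layout above
HOST_TYPES = [
    (4, 2 * 4, 128),   # hosts 1-4: 2 sockets x 4 cores, 128 GB
    (3, 2 * 10, 384),  # hosts 5-7: 2 sockets x 10 cores, 384 GB
]
VM_COUNT = 327      # running VMs
VM_RAM_GB = 2.5     # configured RAM per VM
VM_VCPUS = 2        # 2 virtual sockets x 1 core

total_ram_gb = sum(count * ram for count, _, ram in HOST_TYPES)
total_cores = sum(count * cores for count, cores, _ in HOST_TYPES)

configured_ram_gb = VM_COUNT * VM_RAM_GB
configured_ratio = configured_ram_gb / total_ram_gb

# Assumption: physical cores only, hyperthreading ignored
vcpu_per_core = (VM_COUNT * VM_VCPUS) / total_cores

print(f"Physical RAM:      {total_ram_gb} GB")
print(f"Configured VM RAM: {configured_ram_gb} GB ({configured_ratio:.1%} of physical)")
print(f"vCPU per core:     {vcpu_per_core:.2f}")
```

So on paper the VMs are configured for about half of the cluster's physical RAM, before any overhead is counted.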
For storage, we average under 3 ms latency with peaks under 15 ms. We push 750 to 1500 IOPS per host at steady state throughout the morning.
Overall cluster memory consumption and individual host memory consumption both peak around 60-61% and fluctuate between 40% and 61% on all hosts. Active memory is around one third of consumed memory, with moderate fluctuation of +/- 5%.
The incongruence I see is this. The 'Cluster Capacity' dashboard indicates the cluster can host another 300 VMs before being overloaded. The 'Plan - Workload Distribution' view appears to show that the cluster can absorb another 150+ VMs (and four more storage volumes) before approaching 80% load. The individual hosts show no sign of memory congestion in real-time monitoring. Yet the VMTurbo dashboard is still showing this memory congestion message.
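As a sanity check on those two headroom figures, here is a small sketch that projects configured VM memory against raw physical RAM. It assumes every added VM is also 2.5 GB, and it ignores hypervisor overhead, page sharing, and whatever model VMTurbo actually uses, so treat it as rough arithmetic rather than a reproduction of either dashboard. The helper name is hypothetical.

```python
TOTAL_RAM_GB = 4 * 128 + 3 * 384   # seven hosts from the layout above
VM_RAM_GB = 2.5                    # configured RAM per VM (assumed for new VMs too)
CURRENT_VMS = 327


def configured_ratio(extra_vms: int, ram_gb: float = TOTAL_RAM_GB) -> float:
    """Fraction of physical RAM covered by configured VM memory (hypothetical helper)."""
    return (CURRENT_VMS + extra_vms) * VM_RAM_GB / ram_gb


for extra in (0, 150, 300):
    print(f"+{extra:>3} VMs: {configured_ratio(extra):.1%} of physical RAM configured")

# Same projection with one 384 GB host out of service (an assumed failure scenario)
ram_minus_one_large_host = TOTAL_RAM_GB - 384
print(f"Current VMs, one large host down: "
      f"{configured_ratio(0, ram_minus_one_large_host):.1%}")
```

By this simple measure, +150 VMs lands in the low 70s percent and +300 VMs in the mid 90s percent of physical RAM, which is why the congestion warning at the current load puzzles me.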
What's the real story?