Has anyone done any in-depth testing of Turbonomic's predictive placement versus VMware's Predictive DRS in vROps and ESXi 6.5? I am curious to see how Predictive DRS stands up against Turbonomic.
Thanks for the question! I am going to cite one of my colleagues - Jimmy Herbert - who has run into a few customers running or looking into pDRS who were also curious about the difference. Please see below.
With vSphere 6.5, VMware have introduced predictive DRS (pDRS) which they position as a "game changer", and more specifically as a feature that makes DRS more comparable to Turbonomic.
Before even discussing pDRS, it is important to clarify that VMware hasn't changed the logic by which DRS takes actions; it is still based on a static standard deviation threshold that is calculated from momentary, single-dimension data sets (e.g. CPU, MEM). When a deviation is detected, DRS will move a workload from one host to another (within a cluster) with the intent of balancing the measured metric between the hosts. As long as DRS operates this way, it doesn't matter whether it uses predicted or actual loads; DRS actions remain based on load-balancing logic. Furthermore, DRS still cannot understand application demand and ensure that it is continuously satisfied by infrastructure supply.
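To make the load-balancing logic described above concrete, here is a toy sketch of a standard-deviation threshold check on a single metric. This is not VMware's actual algorithm; the function name, the data layout, and the "pick the move that lowers deviation most" rule are assumptions made purely for illustration.

```python
from statistics import pstdev

def rebalance_step(hosts, threshold):
    """Toy single-metric balancer (hypothetical, not the real DRS).

    hosts: {host_name: {vm_name: load}} where load is e.g. a CPU fraction.
    Returns a (vm, source_host, target_host) move, or None if balanced.
    """
    totals = {h: sum(vms.values()) for h, vms in hosts.items()}
    current_dev = pstdev(list(totals.values()))
    if current_dev <= threshold:
        return None  # deviation is within the static threshold, do nothing

    src = max(totals, key=totals.get)  # most loaded host
    dst = min(totals, key=totals.get)  # least loaded host

    best_move, best_dev = None, current_dev
    for vm, load in hosts[src].items():
        trial = dict(totals)
        trial[src] -= load
        trial[dst] += load
        dev = pstdev(list(trial.values()))
        if dev < best_dev:  # keep the move that balances the metric best
            best_move, best_dev = (vm, src, dst), dev
    return best_move

cluster = {"esx1": {"vm-a": 0.5, "vm-b": 0.3}, "esx2": {"vm-c": 0.1}}
print(rebalance_step(cluster, threshold=0.1))
```

Note that nothing in this sketch knows what the applications inside the VMs actually need; it only evens out one measured number across hosts.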
pDRS leverages the anomaly detection that is trained by operators via vROps. vROps feeds DRS with CPU or MEM spikes that were identified as normal behavior (e.g. expected increased usage). These expected (predicted) spikes will be sent to DRS an hour before they occur, and DRS will consider them as if they were real spikes (observed) and will handle them the same way.
This new method may occasionally prevent a performance issue (by relieving some of the recurring contentions faster than before), but it may also make matters worse and lead to increased contention. Just consider how DRS will behave when the predicted load conflicts with the observed load, when the predicted load doesn't actually happen, or when true contentions are interpreted by vROps as normal recurring spikes.
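The failure modes above can be illustrated with a small sketch. It assumes, purely for illustration, that a predicted spike is merged with the observed load by taking the larger of the two per host; how pDRS really combines them is not stated here, and the function names are hypothetical.

```python
from statistics import pstdev

def effective_load(observed, predicted):
    # Assumption: a predicted spike is treated exactly like an observed one,
    # modeled here as max(observed, predicted) per host.
    return {h: max(observed[h], predicted.get(h, 0.0)) for h in observed}

def would_act(observed, predicted, threshold):
    # The balancer acts when the deviation of the merged loads exceeds the threshold.
    loads = effective_load(observed, predicted)
    return pstdev(list(loads.values())) > threshold

# Case 1: hosts are balanced now, but a spike is predicted on esx1.
# A move is triggered; if the spike never arrives, the move was wasted.
print(would_act({"esx1": 0.4, "esx2": 0.4}, {"esx1": 0.9}, 0.1))

# Case 2: esx2 is genuinely busy, and a spike is predicted on esx1.
# The merged view looks balanced, so the real contention on esx2 is ignored.
print(would_act({"esx1": 0.4, "esx2": 0.9}, {"esx1": 0.9}, 0.1))
```

In both cases the decision is driven by the prediction rather than by what the workloads need at that moment.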
pDRS (and any other new feature added to DRS) continues to rely on the same dated approach DRS takes to determine its actions. This approach fails to assure application performance while maximizing efficiency and is fundamentally different from Turbonomic's approach of abstracting the environment into a common data model and matching real-time workload demand with the underlying shared infrastructure.
pDRS is also limited in its ability to handle scale: "If You enable Predictive DRS (pDRS), Your Software warranty and support policy will not apply to any Software operating in a Cluster containing more than 4,000 Virtual Machines."
Hope this helps.