Opserver icon indicating copy to clipboard operation
Opserver copied to clipboard

Cpu usage for SQL Server showing wrong values

Open maurmun opened this issue 5 years ago • 3 comments
trafficstars

The CPU graph is showing wrong values, which are always higher than the normal values. This is the way how values are calculated inside the RING_BUFFER_SCHEDULER_MONITOR record of the sys.dm_os_ring_buffers. in the following query

Select /* SQL\SQLInstance.CPUHistory.cs@16 */
 DateAdd(s, (timestamp - (osi.cpu_ticks / Convert(Float, (osi.cpu_ticks / osi.ms_ticks)))) / 1000, GETDATE()) AS EventTime,
	   Record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') as SystemIdle,
	   Record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') as ProcessUtilization,
	   Record.value('(./Record/SchedulerMonitorEvent/SystemHealth/MemoryUtilization)[1]', 'int') as MemoryUtilization,
  From (Select timestamp, 
               convert(xml, record) As Record 
	      From sys.dm_os_ring_buffers 
		 Where ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
		   And record Like '%<SystemHealth>%') x
	    Cross Join sys.dm_os_sys_info osi
Order By timestamp

You should use 100 -Record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') to get correct values.

This discrepancy is more evident with servers with more NUMA Nodes/Cores

maurmun avatar Apr 09 '20 12:04 maurmun

I'm unclear what the bug is here: the SQL tab is showing how much CPU SQL Server itself is using, but this calculation would be how much the machine is using, which is not the same thing (and not useful for multiple instances on the same machine, for instance). Is the mismatch in expectations here?

NickCraver avatar May 10 '20 13:05 NickCraver

Well, the RING_BUFFER_SCHEDULER_MONITOR record returns wrong values for the ProcessUtilization as the number of cores/logical processor/sockets goes higher or the cores per socket goes higher. It's clear that it reports the CPU usage of SQL Server and not the whole box, but oh high-end machines it's always higher (and of a great degree) than the real usage; this value is correct only on 1 core or 1 socket boxes. Showing this value in the SQL tab of the dashboard could lead to wrong assumptions (in the case a non SQL expert is looking into it). On a 56 core box for example You can see (correctly) the CPU Usage in the Dashboard (collected with WMI) to be around 5% while the SQL tab could show values as high as 20%. On a high-end box with 192 cores running on an average of 20% that query show values also in the range of 75%-90%.

maurmun avatar May 10 '20 21:05 maurmun

https://support.solarwinds.com/SuccessCenter/s/article/CPU-utilization-is-not-displayed-properly-in-DPA

maurmun avatar May 10 '20 21:05 maurmun