Logical Disk free_bytes and size_bytes counter not updating instantly
Hi,
OK, this one is a weird one. I noticed that the value of the windows_logical_disk_size_bytes counter was not updating instantly after a drive extension. I then looked at the LogicalDisk > Free Megabytes Windows counter (you probably use the raw version of this one or one of its friends) and it is not updating either.
If you wait for a bit (10-15 min), the counters (on Windows and on the exporter) end up updating. If you restart the exporter service, it looks like it updates instantly (I might have to test this one further).
Digging a little bit I found this old article https://www.ibm.com/support/pages/freembytes-not-matching-value-perfmon that gave me a hint.
From my understanding, some disk performance counters are considered costly to calculate and are not updated in real time. It looks like the drive free space and disk size counters are among them. The questions are:
- Is this behavior intended for the exporter (is it the same on the node_exporter?)
- Is there a way to do better (without altering the registry)? The API call returns the correct drive size instantly. (It might be overkill to call it every time, but updating the value once a minute with a cache should be enough.)
My main goal is to point out the current behavior so that it is at least documented, and then if there is a way to do better, great!
This is a very good find! I think at a minimum we should add this to the logical_disk collector documentation. Would you mind submitting a PR for the documentation?
I don't think changing the behavior of a particular collector to cache a result or return a value only every minute would be ideal, but perhaps there needs to be some discussion first.
Do you think I should add a comment directly on the counter description or a warning at the end?
I think a brief note in the counter description, and a more detailed description in the collector documentation.
OK, forgive me if I didn't do it right, but I tried an edit of the doc there: https://github.com/prometheus-community/windows_exporter/pull/846
Oh OK, again this DCO stuff. Sorry, I don’t understand why you can’t do a simple modification like that directly from the website…
You might be able to add the DCO via the website. If you add a Signed-off-by at the bottom of the commit description it may pass.
E.G. Signed-off-by: Ben Reedy <[email protected]>
If not you'll have to use git commit -s --amend to amend the commit with the sign-off.
Now that the documentation is updated, the enhancement option is what remains.
I know Win32 and .NET but unfortunately I don’t know anything about Go, so I can’t really help there.
I can confirm that the free_bytes metric doesn't update in real time.
The graph below shows a clear 5 min step increase on this server, even though the scrape interval is 15 seconds and the data injection was done at a constant rate:

The impact of this delay is different depending on the metric:
- Delay on size_bytes: means that an alert will stay active longer than it should > Impact low
- Delay on free_bytes: means that an alert might be delayed for 5-10 min > Impact very high on some of our servers
Hmm, are there any other sources we could use for these metrics, that update more frequently? While more recent metrics are preferable, increased resource usage and/or metric caching wouldn't be desirable.
I think most monitoring systems rely directly on API calls. I would suggest:
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getdiskfreespaceexa
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getvolumeinformationw
One of the benefits would also be the ability to get the volume name. As I don’t know Go, I have no idea if a library exists to cover those, but I agree that a cache of 1 min sounds reasonable to limit resource usage.
I've had a look and we're currently using the windows library for the service collector.
This library exposes GetDiskFreeSpaceEx and GetVolumeInformation functions which we can use. I'm not familiar with the win32 API, so I'll need to find usage examples or someone else can try to make use of the functions.
We also need to consider the caching implementation, which I don't believe has been implemented for this exporter.
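To make the caching idea concrete, here is a minimal sketch of a per-volume cache with a one-minute TTL. It is not taken from the exporter: DiskUsage, FetchFunc, UsageCache, NewUsageCache and defaultTTL are all hypothetical names made up for illustration. The fetch hook is deliberately left abstract; on Windows it would presumably wrap something like GetDiskFreeSpaceEx, but here it is faked so the sketch runs anywhere.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// DiskUsage holds the two values discussed in this thread (hypothetical type).
type DiskUsage struct {
	SizeBytes uint64
	FreeBytes uint64
}

// FetchFunc is a hypothetical hook that asks the OS about one volume.
// A real implementation would call the Win32 API; that part is not shown.
type FetchFunc func(volume string) (DiskUsage, error)

// defaultTTL is an assumed value, matching the "once a minute" suggestion above.
const defaultTTL = time.Minute

type entry struct {
	usage     DiskUsage
	fetchedAt time.Time
}

// UsageCache remembers the last successful result per volume for ttl.
type UsageCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	fetch   FetchFunc
	entries map[string]entry
}

func NewUsageCache(ttl time.Duration, fetch FetchFunc) *UsageCache {
	return &UsageCache{ttl: ttl, fetch: fetch, entries: make(map[string]entry)}
}

// Get returns the cached value while it is younger than ttl and
// calls fetch otherwise. The lock is held during fetch, so concurrent
// scrapes are serialized rather than issuing duplicate OS queries.
func (c *UsageCache) Get(volume string) (DiskUsage, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[volume]; ok && time.Since(e.fetchedAt) < c.ttl {
		return e.usage, nil
	}
	u, err := c.fetch(volume)
	if err != nil {
		return DiskUsage{}, err // errors are not cached
	}
	c.entries[volume] = entry{usage: u, fetchedAt: time.Now()}
	return u, nil
}

func main() {
	calls := 0
	// Fake fetcher standing in for the real OS query.
	fake := func(volume string) (DiskUsage, error) {
		calls++
		return DiskUsage{SizeBytes: 100 << 30, FreeBytes: 40 << 30}, nil
	}
	cache := NewUsageCache(defaultTTL, fake)
	for i := 0; i < 4; i++ { // four "scrapes" in quick succession
		if _, err := cache.Get(`C:\`); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("OS queries:", calls) // prints "OS queries: 1"
}
```

With a 15 second scrape interval and a one-minute TTL, a design like this would query the OS roughly once every four scrapes per volume, so the worst-case staleness would be about a minute instead of the 5-10 min observed with the performance counters. Whether that trade-off is acceptable for the exporter is still the open question here.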
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
The 5 min delay in spotting issues has an impact on alert reactivity.