Support multiple measures in `MemorySampler`
It would be handy to record multiple measures at once in `MemorySampler`. In particular, recording `process` and `managed_spilled` at the same time gives you a picture of both RAM and disk usage. Tracking any of the other metrics together could be useful too.
This is easy to do. The one small question is how to structure the DataFrame, which currently has a timeseries index and one column per `sample` call. Recording multiple measures effectively adds another dimension (measure) to the data, so an xarray Dataset would actually be the nicest fit here, though we could also just return a DataFrame in tidy format, like xarray does.
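To make the layout question concrete, here is a minimal sketch of the tidy (long) option using only the standard library. The `Sample` record, the `to_wide` helper, the label `run1`, and all values are hypothetical and only illustrate the shape of the data; none of this is the `MemorySampler` implementation.

```python
from collections import namedtuple

# Tidy (long) layout: one row per (time, label, measure).
# `Sample` and the values below are made up for illustration.
Sample = namedtuple("Sample", ["time", "label", "measure", "value"])

rows = [
    Sample(0.0, "run1", "process", 1.0e9),
    Sample(0.0, "run1", "managed_spilled", 0.0),
    Sample(0.5, "run1", "process", 1.5e9),
    Sample(0.5, "run1", "managed_spilled", 2.0e8),
]


def to_wide(rows):
    """Pivot tidy rows into {time: {(label, measure): value}}.

    This mirrors a wide table with one column per (label, measure) pair.
    """
    wide = {}
    for r in rows:
        wide.setdefault(r.time, {})[(r.label, r.measure)] = r.value
    return wide


wide = to_wide(rows)
```

With pandas, rows like these could be passed to `pandas.DataFrame(rows)` to get the tidy frame directly. The wide form then needs two-level columns (label, measure), which is the extra dimension an xarray Dataset would model natively.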
To be used in https://github.com/coiled/coiled-runtime/issues/191.
I believe #6241 is what you're looking for :) It's mostly done; it's just missing unit tests and docs.
IIUC, this issue is about even more than just memory sampling, isn't it? Theoretically, we could sample all kinds of metrics and plot them.
> IIUC, this issue is about even more than just memory sampling, isn't it? Theoretically, we could sample all kinds of metrics and plot them.
In theory, yes. In practice you would need to implement some intelligence when mixing memory and non-memory measures in the same graph, since e.g. you clearly can't make a stacked graph out of them. You'd also need to dynamically change the y axis. It's possible, but it adds to the complexity of the tool significantly.
This tool was never meant to be fancy. If we want to write the holy grail of over-time metering, I think we should scrap this entirely and think about something in bokeh IMHO.
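As a rough illustration of the extra "intelligence" mentioned above, a plot would first have to split the requested measures by unit, so that each group gets its own y axis. This is only a sketch: the `UNITS` table and `group_by_unit` are hypothetical, and `cpu` is a made-up non-memory measure.

```python
from collections import defaultdict

# Hypothetical unit table. "cpu" is an invented non-memory measure,
# included only to show why mixed measures need separate axes.
UNITS = {
    "process": "bytes",
    "managed": "bytes",
    "managed_spilled": "bytes",
    "cpu": "percent",
}


def group_by_unit(measures):
    """Group measure names by unit, preserving the input order."""
    groups = defaultdict(list)
    for name in measures:
        groups[UNITS[name]].append(name)
    return dict(groups)


panels = group_by_unit(["process", "managed_spilled", "cpu"])
```

Sharing a unit only means the measures can share a y axis. Whether they can also be stacked without double counting is a separate decision that this sketch does not attempt.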
> I believe #6241 is what you're looking for :)
Yes, it's exactly what I was looking for!