process-exporter
Uptime graph
Currently, the stat oldest_start_time_seconds can be used to display the start time of the (oldest) process in a group.
However, there's no way to use this in a graph and produce meaningful results - the best I can do is something like time()-namedprocess_namegroup_oldest_start_time_seconds{groupname="some_process"} as a SingleStat to show the current "uptime".
I'd like to be able to graph the uptime of a process, but if I use a similar formula for a graph, it would just continually increase, even after a process restarted. I think the only way to get this information would be if process-exporter itself did the calculation and exposed uptime as now()-starttime(22)
If there's a way I can do this in Grafana without changes to process-exporter, that would be awesome!
I'm confused: how would doing the calculation in the app instead of in promql help with the graphing?
Personally I'm only ever interested in the uptime to know if the app was restarted, which you can get with changes(namedprocess_namegroup_oldest_start_time_seconds[interval]).
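As a rough illustration of the restart check described above, here is a minimal Python sketch of the idea behind changes(): count how often the reported start time differs between consecutive samples in a window. It is a simplified model, not Prometheus's implementation, and the function name count_changes and the sample values are made up for the example.

```python
def count_changes(samples):
    """Count how many times the value differs between consecutive samples.

    Simplified model of what changes(metric[interval]) reports for one series;
    `samples` is the list of values scraped inside the interval, oldest first.
    """
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur != prev)


# Hypothetical start-time samples: the process restarts once inside the window.
window = [1000, 1000, 1000, 4600, 4600]
restarts = count_changes(window)
print(restarts)  # number of restarts seen in the window
```

A result greater than zero means the oldest process in the group was replaced at least once during the window, but it says nothing about how long each process ran.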
At the moment, namedprocess_namegroup_oldest_start_time_seconds would have the following data points (using very simplified examples):
Timestamp | start_time_seconds | Explanation |
---|---|---|
09:00:00 | 1534924800 | Process started at 2018-08-22 08:00 |
09:30:00 | 1534924800 | Process is still running, so start time hasn't changed |
10:00:00 | 1534924800 | Process is still running, so start time hasn't changed |
10:30:00 | 1534932900 | Process restarted at 2018-08-22 10:15 |
11:00:00 | 1534932900 | Process is still running, so start time hasn't changed |
If I graph that using time()-namedprocess_namegroup_oldest_start_time_seconds, I'd get the following points in my graph, depending on when I viewed it (since time() is "now, when the graph is shown"):
Timestamp | Result at 11:00 | Result at 12:00 |
---|---|---|
09:00:00 | 3 hours | 4 hours |
09:30:00 | 3 hours | 4 hours |
10:00:00 | 3 hours | 4 hours |
10:30:00 | 45 minutes | 1 hour 45 mins |
11:00:00 | 45 minutes | 1 hour 45 mins |
However, if the process-exporter information included an actual "uptime" for the process, I'd get values that were based on how long the process had been running when the scrape took place, like so:
Timestamp | start_time_seconds | uptime (seconds) |
---|---|---|
09:00:00 | 1534924800 | 3600 |
09:30:00 | 1534924800 | 5400 |
10:00:00 | 1534924800 | 7200 |
10:30:00 | 1534932900 | 900 |
11:00:00 | 1534932900 | 2700 |
This would be very easy to graph, and would show a steadily rising line as processes ran, which then reset back to 0 when they were restarted.
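To make the arithmetic behind that table concrete, here is a small Python sketch that computes uptime as scrape timestamp minus start time for each sample. It assumes the scrape times in the table are UTC on 2018-08-22; the helper names scrape_epoch and uptime_at_scrape are invented for this example and are not part of process-exporter.

```python
from datetime import datetime, timezone

# (scrape time of day, start_time_seconds) pairs from the simplified table.
# Assumption: scrape times are UTC on 2018-08-22.
SAMPLES = [
    ("09:00:00", 1534924800),
    ("09:30:00", 1534924800),
    ("10:00:00", 1534924800),
    ("10:30:00", 1534932900),
    ("11:00:00", 1534932900),
]


def scrape_epoch(hhmmss, day="2018-08-22"):
    """Convert a scrape time of day (assumed UTC) to a Unix timestamp."""
    dt = datetime.strptime(f"{day} {hhmmss}", "%Y-%m-%d %H:%M:%S")
    return int(dt.replace(tzinfo=timezone.utc).timestamp())


def uptime_at_scrape(samples):
    """Uptime in seconds at each scrape: scrape timestamp minus start time."""
    return [scrape_epoch(t) - start for t, start in samples]


uptimes = uptime_at_scrape(SAMPLES)
for (t, _), up in zip(SAMPLES, uptimes):
    print(t, up)
```

Because each value depends only on the time of its own scrape, the series climbs while the process runs and drops when the start time jumps forward, which is the sawtooth shape being asked for.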
If you can think of any other way to display this information, I'd be happy to do it in promql instead - I just can't figure out a way. I think I want scrape timestamp - start_time_seconds, but I have no idea how to get that value.
Alternatively, if you've got a good way of showing how long processes have been running (not just how many times they've restarted in a particular interval), I'll use that instead.
I'm currently running into the same issue - I'm trying to replace some old scripts that fed data into graphite with the process-exporter. The purpose is to show the runtime of several cronjobs, giving an indication of how long each run takes and whether any cronjobs are running abnormally long. The result is a graph containing a sawtooth pattern.
So far I have not managed to formulate a promql query that will give me this result; a separate uptime gauge would be very useful here.
@Rocketeer007 have you found any solution for this in the meantime?