Uptime graph

Open Rocketeer007 opened this issue 6 years ago • 3 comments

Currently, the stat oldest_start_time_seconds can be used to display the start time of the (oldest) process in a group.

However, there's no way to use this in a graph and produce meaningful results - the best I can do is something like time()-namedprocess_namegroup_oldest_start_time_seconds{groupname="some_process"} as a SingleStat to show the current "uptime".

I'd like to be able to graph the uptime of a process, but if I use a similar formula for a graph, it would just continually increase, even after a process restarted. I think the only way to get this information would be if process-exporter itself did the calculation, and exposed uptime as now()-starttime(22)

If there's a way I can do this in Grafana without changes to process-exporter, that would be awesome!

Rocketeer007 avatar Aug 22 '18 08:08 Rocketeer007

I'm confused: how would doing the calculation in the app instead of in promql help with the graphing?

Personally I'm only ever interested in the uptime to know if the app was restarted, which you can get with changes(namedprocess_namegroup_oldest_start_time_seconds[interval]).
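
As a rough illustration of the idea behind changes(), here is a minimal Python sketch (not Prometheus code) that counts how often consecutive samples of a series differ. The function name count_changes and the sample values are hypothetical and chosen only for illustration.

```python
def count_changes(samples):
    """Count how many times a value differs from the sample before it.

    This mimics the idea behind PromQL's changes() over a range of samples;
    it is an illustrative sketch, not the Prometheus implementation.
    """
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur != prev)


# Hypothetical start-time samples: the process restarts once, so the
# reported start time changes exactly once within the window.
start_times = [1534924800, 1534924800, 1534924800, 1534932900, 1534932900]
restarts = count_changes(start_times)
print(restarts)
```

A non-zero count means the start time moved during the window, which is how a restart shows up in this metric.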

ncabatoff avatar Aug 22 '18 15:08 ncabatoff

At the moment, namedprocess_namegroup_oldest_start_time_seconds would have the following data points (using very simplified examples):

| Timestamp | start_time_seconds | Explanation |
|-----------|--------------------|-------------|
| 09:00:00 | 1534924800 | Process started at 2018-08-22 08:00 |
| 09:30:00 | 1534924800 | Process is still running, so start time hasn't changed |
| 10:00:00 | 1534924800 | Process is still running, so start time hasn't changed |
| 10:30:00 | 1534932900 | Process restarted at 2018-08-22 10:15 |
| 11:00:00 | 1534932900 | Process is still running, so start time hasn't changed |

If I graph that using time()-namedprocess_namegroup_oldest_start_time_seconds, I'd get the following points in my graph, depending on when I viewed it (since time() is "now, when the graph is shown"):

| Timestamp | Result at 11:00 | Result at 12:00 |
|-----------|-----------------|-----------------|
| 09:00:00 | 3 hours | 4 hours |
| 09:30:00 | 3 hours | 4 hours |
| 10:00:00 | 3 hours | 4 hours |
| 10:30:00 | 45 minutes | 1 hour 45 mins |
| 11:00:00 | 45 minutes | 1 hour 45 mins |

However, if the process-exporter information included an actual "uptime" for the process, I'd get values that were based on how long the process had been running when the scrape took place, like so:

| Timestamp | start_time_seconds | uptime |
|-----------|--------------------|--------|
| 09:00:00 | 1534924800 | 3600 |
| 09:30:00 | 1534924800 | 5400 |
| 10:00:00 | 1534924800 | 7200 |
| 10:30:00 | 1534932900 | 900 |
| 11:00:00 | 1534932900 | 2700 |

This would be very easy to graph, and would show a steadily rising line as processes ran, which then reset back to 0 when they were restarted.
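
The uptime column above is just the scrape timestamp minus the start time. A minimal Python sketch of that arithmetic follows; it assumes the clock times in the table are UTC on 2018-08-22, and the helper names scrape_ts and uptime_at_scrape are hypothetical. It is not process-exporter code.

```python
from datetime import datetime, timezone


def scrape_ts(hhmm):
    """Convert an HH:MM clock time on 2018-08-22 (assumed UTC) to a Unix timestamp."""
    hour, minute = map(int, hhmm.split(":"))
    return int(datetime(2018, 8, 22, hour, minute, tzinfo=timezone.utc).timestamp())


def uptime_at_scrape(scrape_time, start_time):
    """Uptime in seconds as seen at scrape time: scrape timestamp minus start time."""
    return scrape_time - start_time


# (scrape clock time, start_time_seconds) pairs from the simplified example.
samples = [
    ("09:00", 1534924800),
    ("09:30", 1534924800),
    ("10:00", 1534924800),
    ("10:30", 1534932900),  # process restarted at 10:15
    ("11:00", 1534932900),
]

uptimes = [uptime_at_scrape(scrape_ts(t), start) for t, start in samples]
print(uptimes)  # [3600, 5400, 7200, 900, 2700]
```

The values rise while the process keeps running and drop after the restart, which is the sawtooth shape described above.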

If you can think of any other way to display this information, I'd be happy to do it in promql instead - I just can't figure out a way. I think I want scrape timestamp - start_time_seconds, but I have no idea how to get that value.

Alternatively, if you've got a good way of showing how long processes have been running (not just how many times they've restarted in a particular interval), I'll use that instead.

Rocketeer007 avatar Aug 22 '18 16:08 Rocketeer007

I'm currently running into the same issue - I'm trying to replace some old scripts that fed data into graphite with the process-exporter. The purpose is to show the runtime of several cronjobs, giving an indication of how long each run takes and whether any cronjobs are running abnormally long. The result is a graph containing a sawtooth pattern.

So far I have not managed to formulate a promql query that will give me this result. A separate uptime gauge would be very useful here.

@Rocketeer007 have you found any solution for this in the meantime?

JanKoppe avatar Nov 26 '19 14:11 JanKoppe