opencensus-go icon indicating copy to clipboard operation
opencensus-go copied to clipboard

stats/view: add Flush to immediately report all collected data points regardless of buffering

Open lychung83 opened this issue 6 years ago • 13 comments

Problem:

When exiting a program running opencensus monitoring, some of the data are not passed to exporter. This happens because as one can see in following code, reportUsage() is called only by timer, so when program exists, all data passed to opencensus after last timer firing event are simply lost.

func (w *worker) start() {
	for {
		select {
		case cmd := <-w.c:
			cmd.handleCommand(w)
		case <-w.timer.C:
			w.reportUsage(time.Now())
		case <-w.quit:
			w.timer.Stop()
			close(w.c)
			w.done <- true
			return
		}
	}
}

Solution proposal:

  1. Make func (w *worker) stop() to be (indirectly) callable by client of package.
  2. Make func (w *worker) start() to call w.reportUsage() before returning when <-w.quit is triggered.

lychung83 avatar Aug 09 '18 00:08 lychung83

Thank you for filing this issue @lychung83!

So currently, (*worker).stop() is NOT called by any function nor is case <-w.quit triggered by any exit condition. The app just exits if exiting.

I think the idiomatic fix here would to add a global function to view called say End() which will invoke (*worker)(defaultWorker).stop and that should involve passing w.reportUsage as you've suggested. However, that's going to cause a send on closed channel panic with the current stateful design of the worker.

/cc @rakyll @ramonza

odeke-em avatar Sep 04 '18 19:09 odeke-em

I think we need a global Flush function that reports all the collected points.

rakyll avatar Sep 04 '18 19:09 rakyll

@rakyll perhaps I am bikeshedding or not, on the name: I thought of Flush too, but that name implies that continued usage after its invocation is allowed, yet we want a function that is terminally invoked, hence the recommendation of End :)

odeke-em avatar Sep 04 '18 19:09 odeke-em

Flush function I was suggesting is not stopping the worker, it only flushes the unprocessed data points. Exporting a function that stops the worker might need more thinking because we never planned to give control of the worker to the user entirely.

Also, in practice, should you care about what's collected after you flushed due to the termination signal?

rakyll avatar Sep 04 '18 19:09 rakyll

Flush function I was suggesting is not stopping the worker, it only flushes the unprocessed data points.

Roger that, that would work. And in deed, users can invoke defer view.Flush() before their main exits so it kills 2 birds with 1 stone.

Also, in practice, should you care about what's collected after you flushed due to the termination signal?

@lychung83 is this an actual problem that you've encountered or were you hypothesizing based off reading the code? I too don't think it should matter, and moreoever we haven't even provided a stop method nor does the worker know when we are terminating except by the Go runtime cleaning up.

odeke-em avatar Sep 04 '18 20:09 odeke-em

To me, Flush() (not stopping the worker) seems enough. My use cases are:

  1. used at the end of the program to report any remaining data points in the worker.
  2. used for the test program to make sure all data points put by any measure.M() call to the exporter inspected by test w/o calling time.Sleep() for certain amount of time that's enough to fire worker.c

lychung83 avatar Sep 05 '18 00:09 lychung83

Okay, am retitling this issue then as Flush is request that is different from the original request of when a program is exiting.

odeke-em avatar Sep 20 '18 22:09 odeke-em

This will solve https://github.com/census-ecosystem/opencensus-go-exporter-stackdriver/issues/43.

razvanh avatar Sep 21 '18 18:09 razvanh

I don't think we should have a separate function to flush each Exporter and one to flush the internal stats buffer. We should have a single global Flush/Shutdown/Close function that only returns when everything has been flushed. It's also important that things get flushed in the right order: flushing the internal stats before flushing the buffer in the exporter will not work as intended. Users should not have to find this out themselves.

semistrict avatar Sep 25 '18 19:09 semistrict

For metrics, concept of Reader was added recently. With reader one can call ReadAndExport to export all data on program termination. It would work same as flush.

rghetia avatar Mar 27 '19 05:03 rghetia

Hi @odeke-em, I noticed one of your commits mentioned in this issue addresses this problem directly, so I'm just wondering if you'll eventually open it as a PR or if there's anything blocking it. Thank you! 🎉

marwan-at-work avatar Nov 02 '19 20:11 marwan-at-work

Hey Marwan, Thank you for the ping! I did open up a PR but it got blocked because there was no consensus/movement to add it via OpenCensus Specs and I believe these libraries are now on auto pilot mode and no longer adding features so unfortunately I don’t think it’ll be done. My PR was sent for over a year.

On Sat, Nov 2, 2019 at 1:27 PM Marwan Sulaiman [email protected] wrote:

Hi @odeke-em https://github.com/odeke-em, I noticed one of the commits mentioned in this issue addresses this issue directly, just wondering if you'll eventually open it as a PR or if there's anything blocking it. Thank you! 🎉

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/census-instrumentation/opencensus-go/issues/862?email_source=notifications&email_token=ABFL3V72S2566O257GETAJLQRXPEDA5CNFSM4FOTOTXKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEC5EJYI#issuecomment-549078241, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFL3V47TL2GLTABDLWOVDDQRXPEDANCNFSM4FOTOTXA .

odeke-em avatar Nov 02 '19 20:11 odeke-em

@odeke-em Thank you for the response. This seems to me more of a bug than a feature since OpenCensus swallows metrics when a process exists.

In the same spirit, does that mean all other features and bug requests are frozen?

I recently opened this issue which, if my assumption is correct, would be quite noteworthy to address: https://github.com/census-instrumentation/opencensus-go/issues/1181

Thanks again :)

marwan-at-work avatar Nov 02 '19 21:11 marwan-at-work