go-agent icon indicating copy to clipboard operation
go-agent copied to clipboard

Provide context-propogated methods for WaitForConnection and Shutdown

Open v3n opened this issue 3 years ago • 4 comments

Summary

Provide the following methods on Application:

ShutdownContext(ctx context.Context) error
WaitForConnectionContext(ctx context.Context) error

Desired Behaviour

Instead of passing a timeout directly, use the cancellation channel on the context to abort shutdown/waiting for conncetions.

Additional context

We handle all timeouts with contexts per Go best practice. Many of the contexts don't have deadlines but are cancellable (and get cancelled by the caller). Right now there's no way for us to pass a timeout signal in those cases. I even took a look at the internals and notices that it uses channels internally, so should be pretty easy to add.

v3n avatar Nov 16 '21 03:11 v3n

Just to be sure, can we get an understanding of your use case for these? This may be valuable insight because generally speaking, Shutdown is not necessary to call at all, and most applications should not call WaitForConnection. Unless there's a specific need to manage that yourself for a very short-lived application, it's best to let the Go Agent manage the connection behind the scenes on its own.

nr-swilloughby avatar Nov 30 '21 22:11 nr-swilloughby

Our services operate in a restricted network environment. It's fairly common to forget to declare the correct outbounds/use the correct proxy for traffic that egresses the network. All the other components of our services provide similiar Start(ctx) error methods that the service uses to determine whether or not the component came up healthy.

These responses are used to determine if we believe that we will inevitably recover or we need to immediately panic and crash.

Since New Relic starts on NewApplication(), that slightly violates our lifecycle spec, but we can use WaitForConnection with a context-propogated timeout to achieve identical functionality (and know if our service didn't start up correctly).

I'm astonished to hear Shutdown is not necessary as the docs say:

Shutdown flushes data to New Relic's servers

I consider proper finalization and flushing to be an hard requirement for any operational monitoring system–otherwise you have no insight into a failure scenario–especially if your paradigm is crash-only failures.

v3n avatar Dec 03 '21 03:12 v3n

To add some extra context, here's some Go psuedo-ish code that illustrations the lifecycle of our services–we have a set of interfaces that are used by the lifecycle manager to gracefully start and shutdown the service, as well as handling final flushes for anything that needs that behavior.

func main() {
	shutdown := make(chan struct{}, 1)

	lifecycle := lifecycle.New(
		Discovery(),
		NewRelicWrapper(newrelic.Application),
		LogPusher(),
		CommandAndControl(shutdown),
		ServiceAPI(),
	)

	lifecycle.Start(context.TODO())
	defer lifecycle.Finalize(context.TODO())

	<-shutdown
	lifecycle.Stop(context.TODO())
}

v3n avatar Dec 03 '21 03:12 v3n

Hi @v3n , thanks for giving us all that context! I'll make sure our Product Manager sees this so they can prioritize this in our product roadmap.

RichVanderwal avatar Dec 14 '21 22:12 RichVanderwal