[service/internal] Allow components to transition from PermanentError to Stopping

Open mwear opened this issue 6 months ago • 4 comments

Description

In #10058 I mentioned:

There is a tangentially related issue with PermanentErrors and the underlying finite state machine that governs transitions between statuses. Currently, a PermanentError is a final state. That is, once a component enters this state, no further transitions are allowed. In light of the work I did on the alternative health check extension, I believe we should allow a transition from PermanentError to Stopping to consistently prioritize lifecycle events for components. This transition also makes sense from a practical perspective. A component in a PermanentError state is one that has been started and is running, although likely in a degraded state. The collector will call shutdown on the component (when the collector is shutting down) and we should allow the status to reflect that.

This PR makes the suggested change and updates the documentation to reflect that. As this is an internal change, I have not included a changelog. Also note, we can close #10058 after this as we've already removed status aggregation from core during the recent component status refactor.
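
To illustrate the idea, here is a minimal sketch of a status state machine in which PermanentError is no longer terminal. It is not the collector's actual implementation: the `Status` type, the `allowed` table, and `canTransition` are hypothetical names, and the transition table is a simplified assumption that only aims to show the PermanentError to Stopping edge.

```go
package main

import "fmt"

// Status is a hypothetical stand-in for a component status.
// It is not the type used in the collector.
type Status int

const (
	StatusStarting Status = iota
	StatusOK
	StatusRecoverableError
	StatusPermanentError
	StatusStopping
	StatusStopped
)

// allowed maps each status to the statuses it may move to.
// The table is a simplified assumption for illustration only.
var allowed = map[Status][]Status{
	StatusStarting:         {StatusOK, StatusRecoverableError, StatusPermanentError, StatusStopping},
	StatusOK:               {StatusRecoverableError, StatusPermanentError, StatusStopping},
	StatusRecoverableError: {StatusOK, StatusPermanentError, StatusStopping},
	// Previously a final state; the proposed change adds Stopping here.
	StatusPermanentError: {StatusStopping},
	StatusStopping:       {StatusStopped},
	StatusStopped:        {},
}

// canTransition reports whether moving from one status to another is
// permitted by the allowed table.
func canTransition(from, to Status) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	// A component in PermanentError can still be shut down.
	fmt.Println(canTransition(StatusPermanentError, StatusStopping))
	// It still cannot recover to OK.
	fmt.Println(canTransition(StatusPermanentError, StatusOK))
}
```

In this sketch the only outgoing edge from PermanentError is Stopping, so the error remains unrecoverable while the shutdown lifecycle event can still be reported.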

Link to tracking issue

Fixes #10058

Testing

Unit tests.

Documentation

Updated docs/component-status.md and associated diagram.

mwear • Aug 23 '24 17:08