gateway icon indicating copy to clipboard operation
gateway copied to clipboard

Delete unused status keys from watchable

Open arkodg opened this issue 1 year ago • 7 comments

Description:

The status keys are only stored and not deleted. Either all the status keys or at least the stale ones should be deleted to free up memory https://github.com/envoyproxy/gateway/blob/cf46fbe776918ad19444e26d637ffcc79676ca23/internal/gatewayapi/runner/runner.go#L112 & https://github.com/envoyproxy/gateway/blob/cf46fbe776918ad19444e26d637ffcc79676ca23/internal/xds/translator/runner/runner.go#L99

arkodg avatar Feb 20 '24 03:02 arkodg

one solution could be deleting the status from watchable right after processing it in https://github.com/envoyproxy/gateway/blob/cf46fbe776918ad19444e26d637ffcc79676ca23/internal/provider/kubernetes/status.go#L26

arkodg avatar Feb 20 '24 03:02 arkodg

At least when we can determine the status update result, we can delete it in subscribeAndUpdateStatus

ShyunnY avatar Feb 21 '24 14:02 ShyunnY

yeah +1 @ShyunnY

arkodg avatar Feb 21 '24 20:02 arkodg

https://github.com/envoyproxy/gateway/blob/1f5df35ba3925575b4969142514932d0de0ed6f0/internal/status/status.go#L76-L102

When status is updated, do we need to obtain the update result to decide whether to delete status keys?

I have two ideas:

  • First: regardless of the status update result, we delete the status key in the watchable. Assuming that the update result fails, there is no need to process it, just wait for the next reconcile. We can delete the status key directly in the subscribeAndUpdateStatus function.
  • Second: We modify the apply method to accept err externally. If the status update result err != nil, we skip the deletion of status keys. If everything is fine (which is what we want to see), then we delete the status keys

ShyunnY avatar Feb 23 '24 08:02 ShyunnY

@ShyunnY 2 options I see are

  • Mark and Sweep in in the publisher Gateway API Layer similar to what is done for XdsIR and InfraIR https://github.com/envoyproxy/gateway/blob/cf46fbe776918ad19444e26d637ffcc79676ca23/internal/gatewayapi/runner/runner.go#L168

    • Pros

      • status cached in watchable will prevent any more publishes to consumer saving CPU
    • Cons

      • status in watchable takes up memory
  • Delete all after processing is done in the subscriber provider layer https://github.com/envoyproxy/gateway/blob/cf46fbe776918ad19444e26d637ffcc79676ca23/internal/provider/kubernetes/status.go#L77

    • Pros

      • Reduces memory
    • Cons

      • provider will receive all publishes, and will perform an extra read on the resource to compare status, that resource should hopefully be cached in controller-runtime, else it will be an I/O op to API server

arkodg avatar Feb 23 '24 20:02 arkodg

In my opinion, I would vote for the first way.

ShyunnY avatar Feb 24 '24 14:02 ShyunnY

1 sounds good

arkodg avatar Feb 24 '24 15:02 arkodg

Hi, can I work on this issue with the first option, unless someone's already dealing with it?

uniglot avatar Mar 04 '24 15:03 uniglot

@uniglot thanks for taking this.

zirain avatar Mar 04 '24 15:03 zirain