Blue-green strategy: workaround to wait until the app is really stopped
As described in https://github.com/cloudfoundry/cloud_controller_ng/issues/3780, in releases of the Cloudfoundry controller prior to 1.184.0, when we instruct Cloudfoundry to stop an app, it immediately reports the app's state as "STOPPED" and shows its CPU, memory, and other metrics as 0.
In this provider, once the app is reported as "STOPPED" we delete it, which also deletes its bindings. Even though the app still has access to the binding credentials in its environment, some services invalidate those credentials as soon as the binding is deleted. The app then cannot finish its work properly.
This workaround gives apps the maximum graceful shutdown time, since we can't trust the "STOPPED" status.
Because the issue has since been fixed in Cloudfoundry, the workaround is opt-in: it is enabled through an option, so users who already have the fix can ignore it.
Is this still necessary? Since 1.184.0 fixed this, the reported status should now be "STOPPING" rather than "STOPPED" while the app shuts down, so the wait loop will block until the app status is updated to "STOPPED".
I am referring to this wait loop: https://github.com/cloudfoundry-community/terraform-provider-cloudfoundry/blob/main/cloudfoundry/managers/v3appdeployers/bluegreen_strategy_v3.go#L118
Another note: the provider does not actually perform the unbind operations on the stopped app. The unbind operation is done by the CF controller itself. The bluegreen_strategy makes these calls:
1. call stop on the venerable app
2. wait for the venerable app's state to become "STOPPED"
3. call delete on the venerable app (the CF controller handles unbinding the app from all its routes and bindings recursively)
If we are indeed seeing bindings removed before the application graceful period is over, we should report this as this would be a bug. However, I think the wait loop that we have in the provider should be enough to prevent this from happening.
You are right. I just checked all regions in SAP BTP, and the Cloudfoundry Controller versions there are all at least 1.184.0.
The workaround is no longer necessary as far as I am concerned.
> If we are indeed seeing bindings removed before the application graceful period is over, we should report this as this would be a bug. However, I think the wait loop that we have in the provider should be enough to prevent this from happening.
=> Just to clarify the initial issue: Cloudfoundry reported the app as "STOPPED" as soon as the tf provider sent the stop call, even though the application was still running (since it has a graceful shutdown period), which rendered step 2 (wait for the venerable app's state to become "STOPPED") useless. Cloudfoundry Controller has since been fixed and now supports the "STOPPING" state, which is reported when the tf provider sends the stop call; the "STOPPED" state is only reported once the app has really stopped.