cf restart [APP] --strategy rolling times out even though the CF_STARTUP_TIMEOUT env variable is set to 15 minutes
- [ X] I reviewed open and closed github issues that may be related to my problem.
- [cf version 8.7.8+515cf4e.2024-02-08] I tried updating to the latest version of the CF CLI to see if it fixed my problem.
- [ X] I attempted to run the command with CF_TRACE=1 to help debug the issue.
- [ ?] I am reporting a bug that others will be able to reproduce.
I have an application with 16 instances. Restarting it with the rolling strategy fails. Following the documentation, I set the environment variable CF_STARTUP_TIMEOUT to 15 minutes with cf se [APP] CF_STARTUP_TIMEOUT 15 and also added the variable to the application's manifest.yaml.
However, the restart still times out after 5 minutes, which is the default. Is this a bug, or do the rolling strategy and CF_STARTUP_TIMEOUT not work together?
What happened
The CLI times out and the flow breaks.
Expected behavior
With CF_STARTUP_TIMEOUT set to 15 minutes, the CLI should wait up to 15 minutes when restarting the app with the rolling strategy.
Exact Steps To Reproduce
Steps to reproduce the behavior; include the exact CLI commands and verbose output:
- Set the application environment variable CF_STARTUP_TIMEOUT to 15.
- Scale the application up to 16 instances, or to a number of instances large enough that startup takes longer than 5 minutes.
- Run cf restart [APP] --strategy rolling
- See the error: Timed out waiting for application [APP] to start
Provide more context
- platform and shell details - Mac OS X 10.11, iTerm
- version of the CLI you are running - cf version 8.7.8+515cf4e.2024-02-08
- version of the CC API Release you are on - 3.159.0
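A variant of the steps above that may be worth trying, offered only as a sketch: `cf help` lists CF_STARTUP_TIMEOUT among the CLI's own environment variables (in minutes), so the example below sets it in the environment of the CLI process rather than on the app. Whether this changes the outcome reported here is not confirmed in this thread. The function name `restart_with_timeout` and the app name `my-app` are hypothetical.

```shell
# Hypothetical wrapper: run a rolling restart with CF_STARTUP_TIMEOUT (minutes)
# set in the environment of the cf CLI process itself, not on the app.
restart_with_timeout() {
  app_name="$1"
  timeout_minutes="${2:-15}"   # default of 15 mirrors the value used in the report
  CF_STARTUP_TIMEOUT="$timeout_minutes" cf restart "$app_name" --strategy rolling
}

# Usage (not executed here):
#   cf scale my-app -i 16
#   restart_with_timeout my-app 15
```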
Thank you @shalomyasap for bringing this to our attention. The issue has been added to our tracker for further investigation.
Hello @shalomyasap ,
The problem could be that the app's startup health check is timing out while waiting for the app to restart. By default it waits for 1 minute, so increasing this value might help. It can be changed through the API. I used the command below:
cf curl "/v3/processes/[:guid]" \
-X PATCH \
-H "Content-type: application/json" \
-d '{
"health_check": {
"type": "port",
"data": {
"timeout": 120
}
}
}'
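The PATCH call above can be wrapped in a small helper so the timeout is validated before it is sent. This is a minimal sketch, not an official tool: the function name `set_health_check_timeout` is hypothetical, and the process GUID must be supplied by the caller.

```shell
# Hypothetical helper around the cf curl call shown above.
# $1: process GUID, $2: health check timeout in whole seconds.
set_health_check_timeout() {
  process_guid="$1"
  timeout_seconds="$2"

  # Reject empty or non-numeric values before calling the API.
  case "$timeout_seconds" in
    ''|*[!0-9]*)
      echo "timeout must be a whole number of seconds" >&2
      return 2
      ;;
  esac

  cf curl "/v3/processes/${process_guid}" \
    -X PATCH \
    -H "Content-type: application/json" \
    -d "{\"health_check\": {\"type\": \"port\", \"data\": {\"timeout\": ${timeout_seconds}}}}"
}

# Usage (not executed here; the GUID is a placeholder):
#   set_health_check_timeout "<process-guid>" 120
```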
Hi @gururajsh, my project has the same issue. We have a timeout of 300 seconds defined, and I can confirm that an application startup will NOT take longer than 120 seconds. However, we have 10 instances defined in the deployment. My question: does the timeout apply to all instances together or to a single instance? I suspect it applies to all instances in the deployment together, and hence the rolling update times out.
I have been able to mitigate the issue by adding the --no-wait flag.
cf restart MY_APP --strategy rolling --no-wait
After that I can check with
cf app MY_APP
whether the rolling update is still in progress. Once it shows only one deployment with the 10 instances instead of two, the rolling update has finished.
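The manual check above can be turned into a polling loop. The following is a sketch under stated assumptions: it queries the CC v3 `/v3/deployments` endpoint with the `app_guids` and `status_values` filters instead of parsing `cf app` output, it assumes `jq` is installed, and the function names `has_active_deployment` and `wait_for_rollout` are hypothetical.

```shell
# Returns 0 while the app still has a deployment whose status is ACTIVE.
# Assumes jq is available and the CC v3 API supports these filters.
has_active_deployment() {
  cf curl "/v3/deployments?app_guids=$1&status_values=ACTIVE" \
    | jq -e '.pagination.total_results > 0' > /dev/null
}

# Polls until no active deployment remains.
# $1: app GUID, $2: max number of checks (default 90), $3: seconds between checks (default 10).
wait_for_rollout() {
  app_guid="$1"
  max_attempts="${2:-90}"
  interval="${3:-10}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if ! has_active_deployment "$app_guid"; then
      echo "rollout finished after ${attempt} check(s)"
      return 0
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "rollout still active after ${max_attempts} checks" >&2
  return 1
}

# Usage (not executed here):
#   cf restart MY_APP --strategy rolling --no-wait
#   wait_for_rollout "$(cf app MY_APP --guid)"
```

With the defaults the loop gives up after roughly 15 minutes, which can be adjusted through the second and third arguments.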
Hi @aamotharald ,
Correct: if the --no-wait flag is not used, the timeout applies to all the instances in the deployment together.
Since the issue has been addressed, I will close this.
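To illustrate why a single timeout covering the whole deployment is easy to exceed, here is a rough worst-case estimate using the numbers from the question above. It assumes instances are replaced one after another and that each takes the full 120 seconds; both are assumptions made for this sketch, and `worst_case_rollout_seconds` is a hypothetical helper.

```shell
# Hypothetical helper: worst-case duration of a rolling restart, assuming
# instances are replaced strictly one after another.
worst_case_rollout_seconds() {
  instances="$1"
  per_instance_seconds="$2"
  echo $((instances * per_instance_seconds))
}

# 10 instances at up to 120 seconds each, compared with a 300 second timeout.
total=$(worst_case_rollout_seconds 10 120)
if [ "$total" -gt 300 ]; then
  echo "worst case ${total}s exceeds the 300s timeout"
fi
# prints "worst case 1200s exceeds the 300s timeout"
```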