cli cf restart [APP] --strategy rolling - getting timeout in spite of CF_STARTUP_TIMEOUT env variable set to 15 minutes

[ X] I reviewed open and closed github issues that may be related to my problem.
[cf version 8.7.8+515cf4e.2024-02-08] I tried updating to the latest version of the CF CLI to see if it fixed my problem.
[ X] I attempted to run the command with CF_TRACE=1 to help debug the issue.
[ ?] I am reporting a bug that others will be able to reproduce.

I have an application with 16 instances. Restarting it with the rolling strategy fails. I followed the documentation and set the environment variable CF_STARTUP_TIMEOUT to 15 minutes with cf se [APP] CF_STARTUP_TIMEOUT 15, and I also added this environment variable to the application's manifest.yaml.

However, the restart still times out after 5 minutes, which is the default. Do the rolling strategy and CF_STARTUP_TIMEOUT not work together, or is this a bug?

What happened: The CLI times out and the flow breaks.

Expected behavior: With CF_STARTUP_TIMEOUT set to 15 minutes, the CLI waits up to 15 minutes when restarting the app with the rolling strategy.

Exact Steps To Reproduce Steps to reproduce the behavior; include the exact CLI commands and verbose output:

Set the application environment variable CF_STARTUP_TIMEOUT to 15
Scale the application up to 16 instances, or to any number of instances large enough that startup takes longer than 5 minutes
Run cf restart [APP] --strategy rolling
See the error: Timed out waiting for application [APP] to start
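
For comparison, cf help -a lists CF_STARTUP_TIMEOUT (in minutes) among the CLI's own environment variables, so a variant of the steps above sets it in the shell that runs cf rather than on the app. The sketch below assumes that reading; the helper name run_with_startup_timeout and the app name MY_APP are hypothetical, and this thread does not confirm whether the rolling strategy honors the variable.

```shell
# Hypothetical helper: run any command with CF_STARTUP_TIMEOUT (minutes)
# set in that command's environment only, leaving the current shell untouched.
run_with_startup_timeout() {
  local minutes="$1"
  shift
  env CF_STARTUP_TIMEOUT="$minutes" "$@"
}

# Usage (requires a logged-in cf CLI; MY_APP is a placeholder):
#   run_with_startup_timeout 15 cf restart MY_APP --strategy rolling
```

Scoping the variable to a single invocation keeps the default for every other cf command run from the same shell.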

Provide more context

platform and shell details - Mac OS X 10.11, iTerm
version of the CLI you are running - cf version 8.7.8+515cf4e.2024-02-08
version of the CC API Release you are on - 3.159.0

Mar 19 '24 20:03 shalomyasap

Thank you @shalomyasap for bringing this to our attention. The issue has been added to our tracker for further investigation.

Jun 21 '24 17:06 gururajsh

Hello @shalomyasap ,

The problem could be that the app startup health check is timing out while waiting for the app to restart. By default it waits for 1 minute, so increasing this value may help. It can be changed using the API. I used the command below:

cf curl "/v3/processes/[:guid]" \
  -X PATCH \
  -H "Content-type: application/json" \
  -d '{
    "health_check": {
      "type": "port",
      "data": {
        "timeout": 120
      }
    }
  }'
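
To avoid looking up the process GUID by hand, the same request can be wrapped in a small helper. This is only a sketch: the function names are hypothetical, it assumes the process to patch is the app's web process, and it assumes the CC API accepts the PATCH on /v3/apps/:guid/processes/:type as an alternative to /v3/processes/:guid.

```shell
# Build the health_check JSON payload for a timeout given in seconds.
health_check_payload() {
  local timeout_seconds="$1"
  printf '{"health_check":{"type":"port","data":{"timeout":%d}}}' "$timeout_seconds"
}

# Hypothetical helper: resolve the app GUID by name, then patch the
# health check timeout of its "web" process (assumed process type).
set_health_check_timeout() {
  local app_name="$1" timeout_seconds="$2" app_guid
  app_guid="$(cf app "$app_name" --guid)" || return 1
  cf curl "/v3/apps/${app_guid}/processes/web" \
    -X PATCH \
    -H "Content-type: application/json" \
    -d "$(health_check_payload "$timeout_seconds")"
}

# Usage (requires a logged-in cf CLI; MY_APP is a placeholder):
#   set_health_check_timeout MY_APP 120
```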

Jul 16 '24 19:07 gururajsh

Hi @gururajsh, my project has the same issue. We have a timeout of 300 seconds defined, and I can confirm that an application startup will NOT take longer than 120 seconds. But we have 10 instances defined in the deployment. My question: does the timeout apply to all instances together or to a single instance? I suspect it applies to all instances in the deployment, and hence the rolling update times out.

Aug 20 '24 08:08 aamotharald

I have been able to mitigate the issue by adding the --no-wait flag.

cf restart MY_APP --strategy rolling --no-wait

After that I can observe with

cf app MY_APP

whether the rolling update is still in progress. Once it shows only one deployment with the 10 instances, rather than two, the rolling update has finished.
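
The manual check can also be scripted as a polling loop. The sketch below is one possible approach, not something confirmed in this thread: the function names are hypothetical, and it assumes that the CC API deployments list accepts the app_guids and status_values filters and that "no ACTIVE deployment" means the rollout is finished.

```shell
# Retry a check command until it succeeds or max_attempts is reached.
wait_until() {
  local max_attempts="$1" interval_seconds="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$interval_seconds"
  done
}

# Assumption: the rollout is done when the API reports zero ACTIVE
# deployments for the app. The grep relies on the JSON printed by cf curl.
rollout_finished() {
  local app_guid="$1"
  cf curl "/v3/deployments?app_guids=${app_guid}&status_values=ACTIVE" \
    | grep -Eq '"total_results":[[:space:]]*0([^0-9]|$)'
}

# Usage (requires a logged-in cf CLI; MY_APP is a placeholder):
#   cf restart MY_APP --strategy rolling --no-wait
#   wait_until 90 10 rollout_finished "$(cf app MY_APP --guid)"
```

With 90 attempts at 10-second intervals, the loop gives up after roughly 15 minutes; both numbers are arbitrary and should be sized to the instance count.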

Aug 20 '24 09:08 aamotharald

Hi @aamotharald ,

Correct. If the --no-wait flag is not used, the timeout applies to all the instances in the deployment.

Aug 23 '24 16:08 gururajsh

Since the issue has been addressed, I will close this.

Aug 23 '24 16:08 gururajsh
