cli icon indicating copy to clipboard operation
cli copied to clipboard

`dapr stop` does not kill additional processes spun up by an app process

Open mukundansundar opened this issue 2 years ago • 15 comments

Version

This is with latest master build of CLI

Expected Behavior

When an application is run in golang using the go run command, and it is started using dapr run -f, it is expected that dapr stop -f will properly interrupt and kill the go run process and associated app process.

Actual Behavior

dapr stop -f stops the go run process, but another process thats started by go run still keeps running eg: /var/folders/8n/vhq7f8w1419g3t4ww_46_tlr0000gn/T/go-build2324290313/b001/exe/app

Steps to Reproduce the Problem

When the distributed calc application in quickstarts is run using the new dapr run -f template

version: 1
apps:
  - appDirPath: ./go/
    appID: addapp
    appPort: 6000
    daprHTTPPort: 3503
    command: ["go", "run", "app.go"]
  - appID: divideapp
    appDirPath: ./node/
    appPort: 4000
    daprHTTPPort: 3502
    command: ["node", "app.js"]
  - appID: multiplyapp
    appDirPath: ./python/
    appPort: 5001
    daprHTTPPort: 3501
    command: ["flask", "run"]
    env:
      FLASK_RUN_PORT: 5001
  - appID: subtractapp
    appDirPath: ./csharp/
    appPort: 7001
    daprHTTPPort: 3504
    env:
      ASPNETCORE_URLS: 'http://localhost:7001'
    command: ["dotnet", "./bin/Debug/netcoreapp7.0/Subtract.dll"]
  - appID: frontendapp
    appDirPath: ./react-calculator/
    appPort: 8080
    daprHTTPPort: 3507
    command: ["node", "server.js"]

Note : netcoreapp3.1 needs to be changed to netcoreapp7.0 for the existing quickstart to work

In the above scenario, go run app.go starts one process which then starts the app process separately.

When dapr stop -f is called using the run template file, it only kills the go run app.go process and not the binary app process forked from it.

But when a binary is built using say go build -o test-app and that binary is run as ./test-app, dapr stop -f kills the application process.

Have tried, send os.Interrupt, syscall.SIGTERM but that does not work as expected.

In dapr stop -f, kill command with process ID is used to kill the process.

ps output with pid and ppid

ps aj | grep go
user 94976 37431 94975      0    2 S+   s001    0:00.00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox go
user 94930 94924 94924      0    1 S+   s005    0:00.21 go run app.go
user 94952 94930 94924      0    1 S+   s005    0:00.01 /var/folders/8n/vhq7f8w1419g3t4ww_46_tlr0000gn/T/go-build2832959224/b001/exe/app

Note that both go run app.go and ....../exe/app belong to the same process group, 94924

Release Note

RELEASE NOTE:

mukundansundar avatar Jan 26 '23 05:01 mukundansundar

@mukundansundar -Just a suggestion- can you please add a 'ps' output for the different stages of this issue showing PID and parent PID - to help comprehend the issue better.

akhilac1 avatar Jan 26 '23 05:01 akhilac1

@akhilac1 updated

mukundansundar avatar Jan 26 '23 05:01 mukundansundar

This exists in normal dapr stop also

Run dapr run --app-id test -- go run app.go Following are the output of various commands

-> ps aj | grep go
user 97405 37431 97405      0    1 S+   s001    0:00.06 dapr run --app-id test -- go run app.go
user 97416 97405 97405      0    1 S+   s001    0:00.30 go run app.go
user 97437 97416 97405      0    1 S+   s001    0:00.02 /var/folders/8n/vhq7f8w1419g3t4ww_46_tlr0000gn/T/go-build3898998650/b001/exe/app
user 97449 16232 97448      0    2 S+   s005    0:00.00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox go
➜ dapr list                              
  APP ID  HTTP PORT  GRPC PORT  APP PORT  COMMAND        AGE  CREATED              DAPRD PID  CLI PID  
  test    54962      54963      0         go run app.go  19s  2023-01-26 11:37.26  97410      97405    
➜  dapr stop test            
✅  app stopped successfully: test
➜   ps aj | grep go
user 97437     1 97405      0    0 S    s001    0:00.02 /var/folders/8n/vhq7f8w1419g3t4ww_46_tlr0000gn/T/go-build3898998650/b001/exe/app
user 97561 16232 97560      0    2 S+   s005    0:00.00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox go

mukundansundar avatar Jan 26 '23 06:01 mukundansundar

@mukundansundar Right now we use kill command with the process PID.. which does not stops the grand children(/var/folders/8n/vhq7f8w1419g3t4ww_46_tlr0000gn/T/go-build3898998650/b001/exe/app in our case). We can use kill with process group PID to stop all children.. This will behave similar to ctrl +c.

Screenshot 2023-01-27 at 2 08 56 PM

pravinpushkar avatar Jan 27 '23 08:01 pravinpushkar

@mukundansundar Right now we use kill command with the process PID.. which does not stops the grand children(/var/folders/8n/vhq7f8w1419g3t4ww_46_tlr0000gn/T/go-build3898998650/b001/exe/app in our case). We can use kill with process group PID to stop all children.. This will behave similar to ctrl +c.

Screenshot 2023-01-27 at 2 08 56 PM

This change should also be fine with the normal dapr stop but few tests were started failing when I made the change for normal dapr stop. Need to investigate for normal dapr stop.

pravinpushkar avatar Jan 27 '23 11:01 pravinpushkar

For normal stop also it should be the same thing ... We need to be able to make changes for Windows also. In Windows we use taskkill which also has a similar parameter to kill the process tree.

mukundansundar avatar Jan 27 '23 11:01 mukundansundar

For normal stop also it should be the same thing

yes, it should be same. But the tests failed consistently on local and github runner. Something to do with tests can also be a possibility.

pravinpushkar avatar Jan 27 '23 11:01 pravinpushkar

I figured out the failure scenario. Nothing to do with the logic but since in the tests processes are started by make file. So, the process group contains every process including the tests starter script. So it kills the whole tests. Need to create a new process group whenever we start the dapr run -f or normal dapr run.

Have made the changes only for dapr run -f for now in this PR - https://github.com/dapr/cli/pull/1181 We will check separately for normal dapr stop as that have to be fixed for windows also.

pravinpushkar avatar Jan 28 '23 07:01 pravinpushkar

partially fixed for the dapr run -f feature in PR #1181

mukundansundar avatar Jan 31 '23 19:01 mukundansundar

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.

dapr-bot avatar Mar 07 '23 18:03 dapr-bot

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.

dapr-bot avatar Apr 13 '23 16:04 dapr-bot

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as pinned, good first issue, help wanted or triaged/resolved. Thank you for your contributions.

dapr-bot avatar Apr 27 '23 11:04 dapr-bot

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.

dapr-bot avatar May 27 '23 12:05 dapr-bot

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as pinned, good first issue, help wanted or triaged/resolved. Thank you for your contributions.

dapr-bot avatar Jun 03 '23 12:06 dapr-bot