
Turbo daemon uses 100% CPU even when no tasks are running

Open AdiRishi opened this issue 1 year ago • 18 comments

Verify canary release

  • [X] I verified that the issue exists in the latest Turborepo canary release.

Link to code that reproduces this issue

N/A

What package manager are you using / does the bug impact?

pnpm

What operating system are you using?

Mac

Which canary version will you have in your reproduction?

1.13.4-canary.3

Describe the Bug

I recently noticed that after running a turborepo command in my monorepo, the CPU would stay at 100% even after the command had finished. I saw that the turbo process was still alive and kicking. After a bit of investigation, I realised that I could trigger this behavior just by running the turbo daemon. I've included a video of the behavior I see. NOTE: I did install the latest canary version of turbo and tested with that; same behavior.

I tried to reproduce this on a new repository made with npx create-turbo@latest -e with-shell-commands however that DID NOT reproduce the issue. Running pnpm turbo daemon start in that repository did not cause the CPU to spike with a long lived turbo process.

Given that I wasn't able to reproduce this in a new repository, I thought to review whether I was doing something odd in my current repository. The only things I can think of are:

  • I have plop config for the turbo gen command at the top level of the repository
  • I use a custom remote cache, so I have the TURBO_TEAM, TURBO_API, TURBO_TOKEN and TURBO_REMOTE_CACHE_SIGNATURE_KEY variables present in my .env file

To Reproduce

https://github.com/vercel/turbo/assets/8351234/a4d74f2d-e1fc-46e0-9bb0-7c8960671bc4

Additional context

The next thing I was going to try was to delete my local copy of the repository and try to re-clone and set it up again to see if the issue persists. However I figured it may be better to make this issue first in case there are specific debugging steps that may reveal the source of the issue.

AdiRishi avatar May 10 '24 04:05 AdiRishi

Hi @AdiRishi, thanks for the issue. Could you share the output of turbo daemon logs? You can also access the log file directly by running turbo daemon status and going to the log file path. If you're not comfortable sharing it here, you can also send it to me at [email protected]

NicholasLYang avatar May 10 '24 14:05 NicholasLYang

Logs seem pretty empty 🙃

https://github.com/vercel/turbo/assets/8351234/07176267-0a81-4d46-b9d4-9adfbeefe47e

AdiRishi avatar May 11 '24 01:05 AdiRishi

Hmm very interesting. How large of a repository are you running inside of? And do you see any logs after a significant amount of time, say 10 minutes?

NicholasLYang avatar May 13 '24 14:05 NicholasLYang

Hmm very interesting. How large of a repository are you running inside of? And do you see any logs after a significant amount of time, say 10 minutes?

I'd say it's a mid-size repository. Around 21 sub-projects in total: a mix of around 6 webapps, 7 cloudflare workers, and then more utility libraries / configs etc. Nothing crazy.

I'll run some further debugging and get the information you want. I'll also continue to re-clone and see if I can reproduce the issue on other systems. I'll get back to you on this.

AdiRishi avatar May 14 '24 07:05 AdiRishi

Gotcha. Any chance you could run the daemon directly by stopping it (turbo daemon stop) then doing turbo daemon -vvv? This will run it in full verbosity. Hopefully that should give a better idea of where the daemon is stalling.

NicholasLYang avatar May 14 '24 18:05 NicholasLYang

Alright, I have some very interesting discoveries to go through.

First off, when I re-clone this repository into a different location and run the daemon from it, this behavior does not occur.

Next I tried to run turbo daemon -vvv on a different turborepo repository which doesn't exhibit this issue. Here was the output; it seems fairly normal, and the logs stopped after a few seconds. arishi-monorepo-daemon-logs.txt

I then ran turbo daemon -vvv on the problem repository, and the logs wouldn't stop. I've captured around 1 minute of logs in this file. The full logfile is around 25MB so I had to gzip it 😅 bad-monorepo-daemon-logs.txt.gz

I've captured both logfiles on mac using a command like this: pnpm turbo daemon -vvv &> daemon-logs.txt.

Root Cause of Bug

Looking through the bad logs I realised there were mentions of a .git folder in workers/turborepo-remote-cache. This was confusing since I didn't think I had git submodules. I went into the directory, and sure enough, there is an inner git repository with unstaged changes 🙃. I think around a month ago I was updating my local copy of this worker and accidentally left the git repository cloned, forgetting to remove the .git folder. So it seems like having these unstaged changes causes whatever the turbo daemon is doing to spin in an infinite loop.

I confirmed this by removing the unstaged changes and deleting the .git folder in the workers/turborepo-remote-cache folder, and everything is back to normal 🎉
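For anyone hitting the same symptom: stray nested .git directories like this can be flushed out with find. A minimal self-contained sketch (the workers/ layout below just reproduces this thread's repo structure in a throwaway temp directory):

```shell
# Demo: locate nested .git directories below a repo root.
# Build a throwaway layout mimicking the one described in this thread.
repo=$(mktemp -d)
mkdir -p "$repo/.git" "$repo/workers/turborepo-remote-cache/.git"

# List every .git that is NOT the root one (root .git sits at depth 1,
# so -mindepth 2 skips it). These are the ones that can confuse watchers.
find "$repo" -mindepth 2 -type d -name .git

rm -rf "$repo"
```

Run from your real repo root, the find line alone (with `.` instead of `"$repo"`) is enough to spot leftover inner clones.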

Still, a very very odd manifestation, definitely does indicate a subtle bug in turbo haha. I'm happy to help with more debugging if it will be helpful to fix the underlying bug :)

AdiRishi avatar May 16 '24 02:05 AdiRishi

Thanks for the very thorough explanation! This should be a pretty easy fix. We currently filter out change events for the root .git folder, but we can probably extend that to be any nested .git folder too.
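In spirit, the extended filter would ignore an event if any component of its path is .git, not just a root-level one. A rough shell illustration (the function name and logic are hypothetical, not turbo's actual code):

```shell
# Hypothetical sketch: ignore a change event if ANY path component is .git,
# mirroring the proposed fix (root-only filtering misses nested repos).
is_git_path() {
  case "/$1/" in
    */.git/*) return 0 ;;   # some component is exactly .git -> ignore event
    *)        return 1 ;;   # normal file -> process event
  esac
}

is_git_path ".git/HEAD"                                 && echo "ignored"
is_git_path "workers/turborepo-remote-cache/.git/index" && echo "ignored"
is_git_path "workers/app/src/index.ts"                  || echo "processed"
```

Note the pattern requires a full `/.git/` component, so files like .gitignore are still processed normally.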

NicholasLYang avatar May 16 '24 14:05 NicholasLYang

+1 here, I had turbo spawning so many processes that I casually ran into a fork failed: resource temporarily unavailable on my terminal. After running a turbo command, the CPU usage would slowly creep up until the max process count was reached for my system. (Screenshot 2024-05-20 at 13 54 27)

I was in the process of consolidating my code into a monorepo, and overlooked that there was a nested .git folder remaining. After removing that one turbo seems to not cause this issue anymore.

giorgiogross avatar May 20 '24 12:05 giorgiogross

Related #3455

hmnd avatar May 20 '24 19:05 hmnd

We've just had to disable the daemon in our repo as it was severely harming the performance of pnpm deploy, which copies files and the workspace's runtime dependencies to an isolated directory. I'm guessing it's a similar root cause to this issue.

samhh avatar Aug 19 '24 09:08 samhh

I think this might be caused by this other bug:

https://github.com/vercel/turborepo/issues/8932

Cypher1 avatar Sep 23 '24 04:09 Cypher1

Today I experienced it again after a long time without issues... I needed to kill the process.

karfau avatar Sep 24 '24 22:09 karfau

I may also have this issue -- though for me it's less a CPU spike and more a process spike.

I get

bash: fork: Resource temporarily unavailable

everywhere on my machine, as I've hit the process limit (greater than 5300 processes) -- on macOS, at least; on Linux I've never run into this. If I force kill all turbo instances in the Activity Monitor, my system goes down to around 500 processes (I have a lot of browser tabs open)
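For anyone hitting the fork failures above, a quick way to check how close you are to the limit and spot lingering turbo processes (a sketch, not official turbo tooling; the pgrep/pkill patterns are illustrative, and pkill is destructive, so inspect what it matches first):

```shell
# Check process headroom vs. the per-user limit (macOS and Linux).
total=$(ps ax | wc -l)
limit=$(ulimit -u)
echo "processes: $total / per-user limit: $limit"

# Count leftover turbo processes; a large number points at this issue.
leftover=$(pgrep -fl turbo | wc -l)
echo "turbo processes: $leftover"

# Reclaim (destructive -- review the pgrep output above before running):
# pkill -f 'turbo daemon'
```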

NullVoxPopuli avatar Oct 29 '24 20:10 NullVoxPopuli

Just experienced this again today... was wondering why my laptop's fans suddenly spun up.

hmnd avatar Oct 30 '24 04:10 hmnd

For me, it happens when I install a package while turbo is running

Serpentarius13 avatar Nov 09 '24 16:11 Serpentarius13

@NullVoxPopuli Is what you're describing above the same as https://github.com/vercel/turborepo/issues/9455?

anthonyshew avatar Nov 23 '24 20:11 anthonyshew

ye

NullVoxPopuli avatar Nov 23 '24 20:11 NullVoxPopuli

For visibility: Some of what's being reported on this issue has overlap with https://github.com/vercel/turborepo/issues/9455. We've fixed 9455 as of 2.3.4-canary.2, so the folks indicating that they have too many processes being left open are likely to see improvement there.

anthonyshew avatar Dec 06 '24 21:12 anthonyshew