turbo
Cache keeps growing indefinitely
What version of Turborepo are you using?
1.1.6
What package manager are you using / does the bug impact?
pnpm
What operating system are you using?
Linux
Describe the Bug
We use Turborepo in a relatively big and active monorepo. Saving and restoring Turborepo's cache in CI is quickly becoming time-consuming: the cache grows to hundreds of MB within a few days or weeks.
Expected Behavior
Turborepo should clean up old cache objects, either automatically or via a flag (e.g. --refresh-cache), so that the cache doesn't grow without bound.
To Reproduce
Keep using Turborepo in an active repo, saving and restoring the same cache over a few days, and watch the cache size grow.
We've started noticing this too: restoring and saving an updated cache of around 1.1 GB easily takes over a minute on GitHub Actions. Alternatively, a CLI command to evict items older than x would be desirable, to keep cache sizes manageable.
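To make the "keep cache sizes manageable" idea concrete, here is a minimal sketch of size-budget eviction. It is not a Turborepo feature; `selectForEviction` and the `{ path, size, mtimeMs }` record shape are hypothetical names chosen for illustration, and each record is assumed to stand for one whole cache entry.

```javascript
// Sketch only: decide which cache entries to evict so the kept total stays under a byte budget.
// `entries` is a hypothetical list of { path, size, mtimeMs } records (e.g. built from fs.stat results).
function selectForEviction(entries, maxBytes) {
  // Sort newest first so the most recently written entries are the ones kept.
  const newestFirst = [...entries].sort((a, b) => b.mtimeMs - a.mtimeMs);
  const evict = [];
  let keptBytes = 0;
  let budgetExhausted = false;
  for (const entry of newestFirst) {
    if (!budgetExhausted && keptBytes + entry.size <= maxBytes) {
      keptBytes += entry.size;
    } else {
      // Once one entry no longer fits, evict it and everything older.
      budgetExhausted = true;
      evict.push(entry.path);
    }
  }
  return evict;
}
```

The function only returns paths; actually deleting them (and choosing the budget) is left to the caller.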
Any news here? We are unfortunately experiencing the same issue, and we can't use Vercel's caching options due to internal guidelines...
If you are using GitHub Actions, I’m building this action to solve this problem.
https://github.com/dtinth/setup-github-actions-caching-for-turbo
Instead of reading/writing the cache on the filesystem and using a separate step (e.g. actions/cache) to save/restore that filesystem state, this action configures Turborepo to read from and write to the GitHub Actions Cache Service API. This allows for fine-grained caching and avoids the problem where the cache grows indefinitely.
If you are not using GitHub Actions, you can deploy an open-source solution on your own infrastructure:
- https://github.com/ducktors/turborepo-remote-cache (uses local filesystem / S3 / GCS / Azure Blob)
- https://github.com/cometkim/turbocache (serverless solution that uses Cloudflare Workers and KV)
What about local development? We have people here ending up with 300 GB in their local Turborepo cache.
My current local folder (I only checked out the code fresh 2 weeks ago) - this is madness!
15G ./node_modules/.cache/turbo
We have the same problem on CI. The cache kept getting bigger, so saving and restoring it took longer and longer, and CI runs were getting terribly slow.
find ./node_modules/.cache/turbo -type f -mtime +7 -exec rm {} +
As a temporary workaround, I run a script that finds files older than one week and removes them.
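One detail of the find command above is easy to miss: -mtime counts age in whole 24-hour periods with the remainder discarded, and +7 means "more than 7", so a file only matches once it is at least 8 full days old. The sketch below mirrors that rule; `matchesMtimePlus` is a hypothetical helper name, not part of any tool.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Mirrors the `find -mtime +N` rule: age is truncated to whole days,
// and the file matches only if that whole-day age is strictly greater than N.
function matchesMtimePlus(mtimeMs, nowMs, days) {
  const ageInWholeDays = Math.floor((nowMs - mtimeMs) / DAY_MS);
  return ageInWholeDays > days;
}
```

If "older than exactly 7 days" is what you want, -mtime +6 is the closer match.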
I've also had this concern while using Turbo's caching. My temporary solution is a modified Stack Overflow answer for deleting files based on their age, written in JavaScript.
Command
node delete-old.mjs node_modules/.cache/turbo 604800000
604800000 = 7 days in ms
Script delete-old.mjs
// Modified from https://stackoverflow.com/a/23022459
import fs from 'fs';
import { fileURLToPath } from 'url';
import path from 'path';
import { rimraf } from 'rimraf';
const __filename = fileURLToPath(import.meta.url); // get the resolved path to the file
const __dirname = path.dirname(__filename); // get the name of the directory
// e.g. node delete-old.mjs directory
// Note: the directory argument is resolved relative to the parent of this script's folder.
const directory = path.join(__dirname, '..', process.argv[2]);
// e.g. node delete-old.mjs directory 604800000
/**
 * Expiry time in milliseconds.
 */
const expiryTime = Number(process.argv[3]) || 604800000;
const dateFormatOptions = {
  weekday: 'long',
  year: 'numeric',
  month: 'long',
  day: 'numeric',
  hour: 'numeric',
  minute: 'numeric',
};
fs.readdir(directory, (err, files) => {
  console.log(`Checking for files older than ${getColouredText(expiryTime + ' ms', 31)}...\n`);
  if (err) {
    return console.error(err);
  }
  if (!files.length) {
    console.log(getColouredText(`Directory empty.`));
  }
  files.forEach((file, index) => {
    fs.stat(path.join(directory, file), (err, stat) => {
      let endTime, now;
      if (err) {
        return console.error(err);
      }
      now = new Date().getTime();
      // Note: ctime is the last status-change time, not the creation time (that is stat.birthtime).
      endTime = new Date(stat.ctime).getTime() + expiryTime;
      if (now > endTime) {
        return rimraf(path.join(directory, file))
          .then(() => {
            console.log(`Successfully deleted expired file ${getColouredText(file, 32)}`);
            console.log(
              `Changed Date: ${getColouredText(stat.ctime.toLocaleString('en-CA', dateFormatOptions), 33)}\n`
            );
          })
          .catch(err => {
            console.error(err);
          });
      }
    });
  });
});
/**
 * Available colours: https://en.wikipedia.org/wiki/ANSI_escape_code#Colors
 *
 * @param {*} text
 * @param {*} colourCode
 * @returns
 */
function getColouredText(text, colourCode = 33) {
  return `\x1b[${colourCode}m${text}\x1b[0m`;
}
If you want to keep a fixed number of caches, you can use my example:
export CACHES_TO_KEEP=2
ls -At -1 -d "$PWD/node_modules/.cache/turbo/"* | tail -n "+$(($CACHES_TO_KEEP*2+1))" | xargs -r rm
Explanation:
- CACHES_TO_KEEP is the number of caches that should remain.
- ls -At -1 -d "$PWD/node_modules/.cache/turbo/"* lists all cache files (absolute paths), sorted by time, newest first.
- tail -n "+$(($CACHES_TO_KEEP*2+1))" skips the lines of the most recent caches:
  - *2: the count is multiplied by 2, as one cache always has 2 files (*.tar.zst and *-meta.json).
  - +1: just an offset that is needed for tail, because tail -n +N starts its output at line N.
- xargs -r rm actually removes the files.
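The pipeline above relies on every cache contributing exactly two adjacent lines to the ls output. As a rough alternative sketch, the same "keep the N most recent caches" rule can be expressed by grouping files by hash first. This assumes the `<hash>.tar.zst` / `<hash>-meta.json` naming described above; `filesToRemove`, `hashOf`, and the `{ name, mtimeMs }` record shape are hypothetical names for illustration.

```javascript
// Strip the two known suffixes to recover the cache hash a file belongs to.
function hashOf(name) {
  return name.replace(/(\.tar\.zst|-meta\.json)$/, '');
}

// Sketch: return the names of files to delete, keeping the `cachesToKeep` most recent cache entries.
// `files` is a hypothetical list of { name, mtimeMs } records.
function filesToRemove(files, cachesToKeep) {
  // Record the newest modification time seen for each hash.
  const newestByHash = new Map();
  for (const { name, mtimeMs } of files) {
    const hash = hashOf(name);
    newestByHash.set(hash, Math.max(newestByHash.get(hash) ?? -Infinity, mtimeMs));
  }
  // Keep the hashes with the most recent timestamps.
  const keep = new Set(
    [...newestByHash.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, cachesToKeep)
      .map(([hash]) => hash)
  );
  // Everything that does not belong to a kept hash is a removal candidate.
  return files.filter(({ name }) => !keep.has(hashOf(name))).map(({ name }) => name);
}
```

Grouping by hash means an entry whose meta file is missing does not shift the count for the entries after it.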
I hope that this will be implemented soon, to avoid these workarounds.
GitHub action
# .github/workflows/actions/remove-outdated-turbo-cache/action.yml
name: Remove outdated turbo cache
description: "Removes outdated caches. This workaround can be removed, when this issue is resolved: https://github.com/vercel/turbo/issues/863"
inputs:
  caches-to-keep:
    description: "Keeps the defined amount of the most recent caches (default: 10). All other caches will be removed"
    default: "10"
runs:
  using: "composite"
  steps:
    - name: Remove old turbo cache
      shell: bash
      run: |
        outdated_files=$(ls -At -1 -d "$PWD/node_modules/.cache/turbo/"*)
        outdated_files_amount=$(echo "$outdated_files" | wc -l | awk '{$1=$1};1')
        outdated_files_size=$(du -ch $outdated_files | tail -1 | cut -f 1)
        echo "Removing $outdated_files_amount outdated cache files ($outdated_files_size):"
        echo "$outdated_files"
        echo "$outdated_files" | tail -n "+$((${{ inputs.caches-to-keep }}*2+1))" | xargs -r rm
# Usage as a workflow step:
# ...
- name: Remove outdated turbo cache
  uses: ./.github/workflows/actions/remove-outdated-turbo-cache
  with:
    caches-to-keep: 25
# ...