Some keys in sharedTags never get cleaned up if Redis is evicting keys (occurred in the redis-strings implementation)
Brief Description of the Bug
It seems like the sharedTags map contains several tags that no longer exist. In my case this resulted in a sharedTags map of 150 MB which got scanned on every revalidateTag, slowing down Redis a lot.
Here are some numbers:
- Keys in db: 15031
- Keys in sharedTags file: 184875
- Keys in db but not in sharedTags: 270
- Keys in sharedTags but not in db: 170114
--> only about 10% of the sharedTags file is still relevant
Severity
Major
Frequency of Occurrence
Always
Steps to Reproduce
1.) Start a Redis instance with very low memory and the volatile-lfu eviction policy.
2.) Produce a lot of cache entries.
3.) Check sharedTags.
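For reference, a minimal configuration to force evictions quickly could look like this (set in redis.conf or via redis-cli config set; the 10mb limit is just an example value to trigger evictions fast, not the one from our setup):

maxmemory 10mb
maxmemory-policy volatile-lfu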
Expected vs. Actual Behavior
sharedTags got so huge that it blocked all of Redis's resources during revalidateTag.
Environment:
- OS: Linux
- Node.js version: 20
- @neshca/cache-handler version: ^1.4.0
- next version: 15 rc0
EDIT: Since this plugin no longer seems to be actively maintained, we created a new, improved Redis cache handler for Next.js. It uses Redis keyevent notifications to fix this issue. Besides that, I implemented a couple of other performance optimizations by eliminating the resource-hungry HSCAN and adding in-memory caching synced via keyspace notifications (so it is multi-node capable).
You can take a look here: https://github.com/trieb-work/nextjs-turbo-redis-cache Feel free to test or contribute as we are looking for more maintainers.
I can create a PR for fixing it. What solution would you prefer?
- fetching keys in revalidateTag function and extending tagsToDelete with the outdated keys? https://github.com/caching-tools/next-shared-cache/blob/fe5ad0cd7699508195c5f0223cae5920c1305da5/packages/cache-handler/src/handlers/redis-strings.ts#L167
- or starting an interval to do a cleanup routine every few hours
I have made another test with a large DB so that keys do not get evicted, and the problem exists there as well. Not on the same scale, but some keys in the sharedTags file are still no longer valid, and this will accumulate over time. I think as soon as a key reaches its TTL it gets deleted from Redis but is not cleaned up from sharedTags.
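You can see this directly in redis-cli (the JSON: prefix and __sharedTags__ key follow the defaults shown later in this thread; the page key and tag are made up for illustration):

SET JSON:/some/page '<cached value>' EX 60
HSET JSON:__sharedTags__ /some/page '["tag-a"]'
# ... wait 60 seconds ...
EXISTS JSON:/some/page                     # returns 0, the cache entry expired
HEXISTS JSON:__sharedTags__ /some/page     # returns 1, the hash field is still there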
I have now analyzed both solutions. Each comes with its own drawbacks:
- fetching keys in revalidateTag function and extending tagsToDelete with the outdated keys
Obviously this will make the already slow revalidateTag even slower by adding a key scan operation. Even though this scan is much faster than the sharedTags scan itself, it still adds to the slowdown.
- or starting an interval to do a cleanup routine every few hours
This will be hard to implement in serverless environments. I'm not sure what the focus of this package is. I'm using it with a Node server. Are there users running this in serverless environments as well? For server environments we can simply use setInterval to start a cleanup routine. For serverless environments we would need something like unstable_after combined with a probability function to start a cleanup, for example on roughly every 500th request; see the sketch below.
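To make that more concrete, here is a rough sketch of both variants. cleanupSharedTags is a hypothetical function standing in for the actual cleanup logic, and the interval and 1/500 probability are just example values:

// Hypothetical entry point; the real logic would remove stale fields from the
// sharedTags hash (similar to the cron script posted further down in this thread).
async function cleanupSharedTags() {
  // ...
}

// Node server: run the cleanup every few hours.
const CLEANUP_INTERVAL_MS = 6 * 60 * 60 * 1000
setInterval(() => {
  cleanupSharedTags().catch((err) => console.error('sharedTags cleanup failed', err))
}, CLEANUP_INTERVAL_MS)

// Serverless: trigger the cleanup on roughly every 500th request,
// e.g. wrapped in Next's unstable_after so it runs after the response is sent.
function maybeScheduleCleanup() {
  if (Math.random() < 1 / 500) {
    cleanupSharedTags().catch((err) => console.error('sharedTags cleanup failed', err))
  }
}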
Hey @tilman
We're experiencing the exact same issue, which is significantly affecting our application's performance by increasing latency and hurting the user experience. As a temporary solution, we've implemented a cronjob to clear the Redis database every hour. Initially, we assumed the issue was due to the Redis database size, so we set a default TTL to expire objects quickly, but this had no effect. After extensive troubleshooting, we found that uncleaned sharedTags were the root cause.
Yep, the missing cleanup of expired/evicted keys will grow sharedTags drastically. In our case it was 150 MB after 1 day. After we applied a fix to the redis-strings handler it is only 15 MB after 7 days. Here is the patch we are using right now: https://gist.github.com/tilman/d13271d0e0b8772dcd7d467846a17044
But in general we are working on a better implementation of the redis-strings handler right now. Even with the patch, performance is still not optimal, because every revalidateTag call performs a full scan of sharedTags (and, with the fix, now also of the keys).
The new handler will have the following features:
- an in-memory sharedTags map, reducing Redis load a lot
- its changes synced via Redis pub/sub across all running containers
- periodic resync of sharedTags to prevent drift
- removal of expired/evicted keys from sharedTags using redis keyspace notifications
Let me know if you are interested in it, then I can try to open-source it :)
@tilman Hi, I was having the same issue you mentioned, but I didn't figure out how to apply the patch you made. I updated the lib, but the latency keeps going up and sharedTags keeps growing.
I noticed that if an entry is created with, for example, key="abc...", a reference to it is created in sharedTags. After it is cleaned up and I make the same request again, the same key is created, but it doesn't create a new entry in sharedTags; it just reuses the existing reference there. Is this somehow related?
Lastly, if it's not a problem for you, could you explain how I can apply the fix you made that decreased the storage so drastically?
I had the same issue.
I set up a cron job to clean sharedTags every hour; the script iterates over all the keys in sharedTags and, if the corresponding cache entry doesn't exist, pushes the key into an array so they can be deleted in batches later.
@slackerzz could you please share the code of that cron job? I have the feeling I'm having the same problem.
@fbudassi here it is:
const BATCH_SIZE = 1000
const KEY_PREFIX = 'JSON:'
const SHARED_TAGS_KEY = '__sharedTags__'

// Delete hash fields in batches so a single huge HDEL doesn't block Redis
async function batchDelete(client, key, keys) {
  const totalBatches = Math.ceil(keys.length / BATCH_SIZE)
  for (let i = 0; i < totalBatches; i++) {
    const batch = keys.slice(i * BATCH_SIZE, (i + 1) * BATCH_SIZE)
    await client.hDel(key, batch)
  }
}

async function cleanRedisCache() {
  const client = buildRedisClient()
  if (!client) {
    throw new Error('Could not build Redis client')
  }

  try {
    await client.connect()
  } catch (error) {
    // disconnect may itself fail if the client never opened; ignore that
    await client.disconnect().catch(() => {})
    logger.info('Redis disconnected')
    throw error
  }

  const sharedTagsKey = `${KEY_PREFIX}${SHARED_TAGS_KEY}`

  // Collect every sharedTags field whose cache entry no longer exists in Redis
  const keys = await client.hKeys(sharedTagsKey)
  const fieldsToDelete = []
  for (const key of keys) {
    const fullKey = `${KEY_PREFIX}${key}`
    const relatedKey = await client.exists(fullKey)
    if (!relatedKey) {
      fieldsToDelete.push(key)
    }
  }

  if (fieldsToDelete.length > 0) {
    const length = await client.hLen(sharedTagsKey)
    logger.info({ totalFields: length, fieldsToDelete: fieldsToDelete.length }, 'Fields to delete')
    await batchDelete(client, sharedTagsKey, fieldsToDelete)
  } else {
    logger.info('No fields to delete')
  }

  await client.quit()
}

try {
  await cleanRedisCache()
  process.exit(0)
} catch (err) {
  logger.error(err, 'Error cleaning Redis cache')
  process.exit(1)
}
I use pino for logging, and the buildRedisClient function simply returns a Redis client created by calling createClient from the redis npm package.
With Next I use ISR (without generating the pages at build time) and then rely on automatic revalidation by setting a revalidate value.
The issue here is the never-expiring keys inside the hash. If you don't manually revalidate your cache/paths, these keys will grow until they take up all the available memory.
Redis 7.4 added expiration of individual hash fields, but it's not available in Valkey (yet?).
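On Redis 7.4+ that would allow giving each sharedTags field roughly the same lifetime as its cache entry, e.g. with HEXPIRE (the JSON:__sharedTags__ key matches the script above; the field name and the 3600-second TTL are just examples):

HEXPIRE JSON:__sharedTags__ 3600 FIELDS 1 /some/page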
@slackerzz thank you very much! this really helps!
Not sure if you considered this option as well: https://redis.io/docs/latest/develop/use/keyspace-notifications/ TL;DR: Redis can notify you of events, like expired/evicted key events, so instead of a cronjob you could get notified by Redis when a key expires or is evicted and remove it from sharedTags.
The notifications are disabled by default and need to be enabled with this setting in the redis.conf:
notify-keyspace-events Ex
or with this command line (though this is temporary and won't be kept if redis is restarted):
redis-cli config set notify-keyspace-events Ex
Then with something like this you can subscribe to the events and react to them:
import Redis from 'ioredis';
const redis = new Redis();
const sub = new Redis();
sub.subscribe('__keyevent@0__:expired', (err, count) => {
if (err) throw err;
console.log('Subscribed to expired events');
});
sub.on('message', async (channel, key) => {
console.log(`Key expired: ${key}`);
// Example: Update another key when this one expires
if (key.startsWith('session:')) {
const userId = key.split(':')[1];
await redis.set(`user:${userId}:status`, 'offline');
}
});
(short snippet thanks to chatgpt ;-))
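For this issue specifically, the same pattern could be used to drop the corresponding field from the sharedTags hash whenever a cache key expires or is evicted. This is only a sketch, not the handler's actual code: it assumes the JSON: prefix and __sharedTags__ key from the cron script above, and evicted-key events additionally require the e flag (e.g. notify-keyspace-events Exe):

import Redis from 'ioredis';

const redis = new Redis();
const sub = new Redis();

const KEY_PREFIX = 'JSON:';
const SHARED_TAGS_KEY = `${KEY_PREFIX}__sharedTags__`;

// Keyevent channels for expired and evicted keys in DB 0
sub.subscribe('__keyevent@0__:expired', '__keyevent@0__:evicted', (err) => {
  if (err) throw err;
  console.log('Subscribed to expired/evicted events');
});

sub.on('message', async (channel, key) => {
  // Ignore keys that don't belong to the cache handler, and the hash itself
  if (!key.startsWith(KEY_PREFIX) || key === SHARED_TAGS_KEY) return;
  // sharedTags fields are stored without the prefix (see the cron script above)
  const field = key.slice(KEY_PREFIX.length);
  await redis.hdel(SHARED_TAGS_KEY, field);
});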
Hey all,
I created a new redis cache handler to solve exactly this problem.
It uses exactly these Redis keyevent notifications to fix this issue. Besides that, I implemented a couple of other performance optimizations by eliminating the resource-hungry HSCAN and adding in-memory caching synced via keyspace notifications (so it is multi-node capable).
You can take a look here: https://github.com/trieb-work/nextjs-turbo-redis-cache
@tilman thank you! I'm having a look at it right now. Do you think it'll work with Next.js 14.2.29? (full app router, btw)
I'm answering myself here, after reading the README.md of the project:
It is not compatible with Next.js 14.x. or 15-canary or if you are using Pages Router
Anyway, a good reason to finish my partially finished upgrade to Next 15, which I had mainly stopped because of @neshca ;-)
@fbudassi yeah, unfortunately it only supports Next.js 15, to keep it as clean and small as possible... The Next team changed so much in the cache handling from 14 to 15, really annoying 😁 But this implementation now works really well! It keeps everything clean and fast by using Redis keyspace notifications and an in-memory cache per instance to reduce Redis calls.