mirrorbits Accessing a file that isn't mirrored results in wrong size/mtime being saved

Start mirrorbits
Add new file to repo
Access file via ?mirrorlist (only fallbacks are listed)
File has size 0 and mtime 0 but shouldn't

Jun 15 '21 19:06 lazka

In general there is some problem with things getting cached and only "fixing themselves" after some minutes.

Jul 04 '21 08:07 lazka

We bumped into the same issue but after 12h the situation was still the same.

Triggering a mirrorbits refresh didn't make a difference.
Restarting the service fixed file sizes and hashes.

Jan 24 '23 17:01 zen-fu

I also bumped into this issue. It seems to be a bug with the LRU cache (mirrors/cache.go).

Mirrorbits uses an internal LRU cache in front of the Redis database: when it needs information about a file, it first queries the LRU cache, and if the result is there, it doesn't query the database. Of course, this means that when file information is updated (i.e. a scan returns and the updated values are committed to the database), the matching entries in the LRU cache must be invalidated.
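To make the invalidation requirement concrete, here is a minimal sketch of a cache sitting in front of a database. It is not the mirrorbits code: `Store`, `FileInfo`, `Commit` and `Invalidate` are hypothetical names, plain maps stand in for both Redis and the LRU cache (eviction is left out), and it assumes that a lookup for a not-yet-scanned file caches a zero value, which is one way the size 0 / mtime 0 symptom could arise.

```go
package main

import "fmt"

// FileInfo is a hypothetical stand-in for the per-file metadata.
type FileInfo struct {
	Size    int64
	ModTime int64
}

// Store is a hypothetical read-through cache in front of a database.
// Plain maps stand in for Redis and for the LRU cache (no eviction).
type Store struct {
	db    map[string]FileInfo
	cache map[string]FileInfo
}

func NewStore() *Store {
	return &Store{db: map[string]FileInfo{}, cache: map[string]FileInfo{}}
}

// Get returns the cached entry if there is one; otherwise it reads the
// database and caches whatever it finds (assumption: including the zero
// value for files the database doesn't know yet).
func (s *Store) Get(path string) FileInfo {
	if fi, ok := s.cache[path]; ok {
		return fi
	}
	fi := s.db[path]
	s.cache[path] = fi
	return fi
}

// Commit writes scan results to the database only; the cache is untouched.
func (s *Store) Commit(path string, fi FileInfo) { s.db[path] = fi }

// Invalidate drops the cached entry so the next Get re-reads the database.
func (s *Store) Invalidate(path string) { delete(s.cache, path) }

func main() {
	s := NewStore()
	fmt.Println(s.Get("/new.iso")) // not scanned yet: zero size and mtime are cached
	s.Commit("/new.iso", FileInfo{Size: 1024, ModTime: 1623783960})
	fmt.Println(s.Get("/new.iso")) // still the stale cached entry
	s.Invalidate("/new.iso")
	fmt.Println(s.Get("/new.iso")) // fresh values from the database
}
```

Without the `Invalidate` call, the second and third lookups would keep returning the stale entry until something evicts it.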

This part of the code is problematic:

https://github.com/etix/mirrorbits/blob/0d00d9ed9693a27df22bda108fa1608de59ed31b/database/pubsub.go#L141-L155

The "listener" here is in fact the LRU cache, and the content of the message is a file path. When a scan returns, every file is updated in the database, generating one message in the channel per file. So the function handleMessage is called once for every file in the repo, every time a scan returns.

However, as the code snippet above clearly states: when the channel is full, mirrorbits doesn't wait for the listeners to consume messages from the channel. Instead, it just drops the message. I did a quick test, and around 7% of the messages were dropped. If messages are dropped, the LRU cache is never notified that some files were updated, and so outdated entries are not discarded as they should be.
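The dropping behaviour can be reproduced in isolation with a non-blocking channel send. The sketch below is an illustration of the pattern, not a copy of the mirrorbits code; `notifyNonBlocking` and the buffer size are made up for the example.

```go
package main

import "fmt"

// notifyNonBlocking tries to deliver each path to the listener channel.
// When the buffer is full it does not wait: the message is dropped.
// It returns the number of dropped messages.
func notifyNonBlocking(listener chan<- string, paths []string) int {
	dropped := 0
	for _, p := range paths {
		select {
		case listener <- p:
			// delivered
		default:
			dropped++ // buffer full: this invalidation is lost
		}
	}
	return dropped
}

func main() {
	listener := make(chan string, 2) // small buffer, nobody is reading yet
	paths := []string{"/a", "/b", "/c", "/d", "/e"}
	dropped := notifyNonBlocking(listener, paths)
	fmt.Printf("delivered=%d dropped=%d\n", len(listener), dropped)
	// prints "delivered=2 dropped=3"
}
```

In a real instance the listener does consume messages, so the drop rate depends on how fast it keeps up with the burst generated by a scan.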

I suppose that, for a busy instance, the issue can easily go unnoticed, as entries in the LRU cache will be evicted anyway (LRU = Least Recently Used). And if they are not evicted, the consequence is just that mirrorbits might redirect to a fallback instead of redirecting to valid mirrors.

But for a mirrorbits instance that doesn't receive much traffic, outdated entries can stay there forever, and the issue is very visible. As @zen-fu noted, a restart is enough to fix it (since the LRU cache doesn't survive a restart).

So the easy fix is just to make mirrorbits wait when the channel is full. I'm testing it, and it seems to work so far.
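As a rough sketch of what "wait when the channel is full" means in Go (again an illustration with hypothetical names, not the actual patch): a plain channel send blocks until the listener has made room, so no message is lost even with a small buffer.

```go
package main

import (
	"fmt"
	"sync"
)

// notifyBlocking delivers every path, waiting whenever the buffer is full.
func notifyBlocking(listener chan<- string, paths []string) {
	for _, p := range paths {
		listener <- p // blocks until the consumer has made room
	}
}

// deliverAll starts a consumer, sends all paths through a channel with the
// given buffer size, and returns how many messages the consumer received.
func deliverAll(paths []string, buffer int) int {
	listener := make(chan string, buffer)
	received := 0
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for range listener {
			received++
		}
	}()
	notifyBlocking(listener, paths)
	close(listener)
	wg.Wait() // the consumer has finished, so reading received is safe
	return received
}

func main() {
	paths := []string{"/a", "/b", "/c", "/d", "/e"}
	fmt.Println(deliverAll(paths, 2)) // prints 5
}
```

The trade-off is that the sender now stalls if a listener stops consuming, so this only works as long as every listener keeps reading from its channel.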

Dec 08 '23 11:12 elboulangero

This issue can be closed now

Feb 03 '24 14:02 elboulangero
