Files disappearing on disk in an unexpected manner

Open willdurand opened this issue 4 years ago • 7 comments

Note: I don't know what's going on yet.


While debugging some git-extraction problems, I noticed a version without a file on disk. The DB is inconsistent and still thinks the file exists, yet the file is not present on disk in any of the possible locations for version files (there are 2 possible locations). In addition to causing git-extraction issues, there are code-search problems as well, see: https://sentry.prod.mozaws.net/operations/olympia-prod/issues/10690523/
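A check like the one described above (a DB record exists, but the file is in none of its possible locations) could be sketched roughly as follows. This is only an illustration, not code from the project: `find_missing_files` and the mapping of file ids to candidate paths are hypothetical, and the real locations would have to come from the application's own storage settings.

```python
from pathlib import Path
from typing import List, Mapping, Sequence


def find_missing_files(
    candidates_by_file_id: Mapping[int, Sequence[Path]],
) -> List[int]:
    """Return the ids of files that exist at none of their candidate paths.

    `candidates_by_file_id` is a hypothetical mapping from a DB file id to
    the locations where that file may live on disk.
    """
    missing = []
    for file_id, candidates in candidates_by_file_id.items():
        # A file only counts as missing if every candidate location is empty.
        if not any(path.is_file() for path in candidates):
            missing.append(file_id)
    return sorted(missing)
```

Checking every candidate location before reporting a file as missing avoids false positives for files that simply live in the other location.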

We thought it was a one-time event, but digging into this issue revealed more cases like it: this affects many files/versions, not just one. It's worth noting that these files are not old (< 18 months).

This problem occurs for signed and approved versions and we have git commits (for code-manager) before and after signing. This means we had valid XPIs on disk at some point but they disappeared later...

AFAIK, except for the test suite, we do not have code to actually delete files on disk.

┆Issue is synchronized with this Jira Task

willdurand avatar Sep 10 '21 07:09 willdurand

More than a dozen files disappeared recently.

wagnerand avatar Oct 12 '21 10:10 wagnerand

There were some Sentry errors related to CRON tasks (failing to move files). One theory to explore: could some CRON tasks overlap and cause issues?

willdurand avatar Nov 03 '21 15:11 willdurand

@diox unfortunately, I don't have much bandwidth to investigate this issue. Would you be able to take a look, please?

willdurand avatar Nov 03 '21 15:11 willdurand

It doesn't look like it's related to the crons.

Going through the logs I built a list of ~~23~~ 24 files that are missing, created between 2016 and 2020 with the majority created in 2019 (didn't find any from 2021, 2018 or 2017 - there is only one from 2020). Most of them were auto-approved and never changed status since, and were still marked as public. One even belongs to a public listed version. None belongs to a deleted version or deleted add-on (or force-disabled or blocked add-on, for that matter).

The few that are disabled were rejected, but also never approved in the first place - they are from 2016, from before auto-approval.
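For anyone wanting to reproduce a breakdown by creation year like the one above from their own list of missing files, a minimal sketch follows. It assumes the creation dates have already been collected; `count_by_creation_year` is a hypothetical helper and is not part of the project.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable


def count_by_creation_year(created_dates: Iterable[date]) -> Dict[int, int]:
    """Count how many files were created in each year, ordered by year.

    `created_dates` is assumed to hold one creation date per missing file.
    """
    counts = Counter(created.year for created in created_dates)
    return dict(sorted(counts.items()))
```

Years with no missing files are simply absent from the result, which makes gaps easy to spot.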

diox avatar Nov 08 '21 15:11 diox

Took a fresh look at this.

A few interesting things to note:

  • Looking through all Sentry events, there don't seem to be any new occurrences since last time - so this is only affecting a limited subset of files
  • It has only happened once since we made unlisted submissions go through auto approval in November 2019 (and the few listed versions it happened on are the oldest).
  • For the one version that it happened on in 2020, it seems we do not have the post-signing version in git. For the few others I've checked, we do, but notably we changed the way we extract in June 2019 to only run the task once the transaction has been committed.
  • In at least one instance, we had an exception for a missing file in December 2021 that I can download fine today (!)
  • There isn't anything interesting for the affected add-ons/versions in activity logs AFAICT

Next step will be sharing the list with ops to see what files are actually missing today. But I doubt we'll find the exact reason why those files disappeared... I suspect it has to do with transactions and write errors causing a desync, but I don't know exactly.

diox avatar Jan 31 '22 14:01 diox

All files from the list I built from the logs are actually missing. They are unfortunately too old to recover from backups and also too old for backups to tell us when they disappeared.

diox avatar Feb 09 '22 13:02 diox

Old Jira Ticket: https://mozilla-hub.atlassian.net/browse/ADDSRV-57

KevinMind avatar May 03 '24 16:05 KevinMind