Files disappearing on disk in an unexpected manner
Note: I don't know what's going on yet.
While debugging some git-extraction problems, I noticed a version without a file on disk. The DB is inconsistent: it still thinks the file exists, yet the file is not present on disk in either of the 2 possible locations for version files. In addition to causing git-extraction issues, this also breaks code search, see: https://sentry.prod.mozaws.net/operations/olympia-prod/issues/10690523/
We thought it was a one-time event, but digging into this issue revealed more cases like it, i.e. this isn't just about a single file/version but many. It's worth noting that these files are not old (less than 18 months).
This problem occurs for signed and approved versions and we have git commits (for code-manager) before and after signing. This means we had valid XPIs on disk at some point but they disappeared later...
AFAIK, except for the test suite, we do not have code to actually delete files on disk.
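To make the inconsistency described above easier to measure, here is a minimal sketch of a check that takes the file paths the DB believes exist and reports those found in none of the candidate storage locations. Everything in it is an assumption for illustration: `find_missing_files`, the example directory names and the example file names are hypothetical and are not taken from the addons-server code base.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def find_missing_files(
    relative_paths: Iterable[str],
    candidate_roots: Iterable[Path],
) -> List[str]:
    """Return the relative paths that exist under none of the candidate roots.

    `relative_paths` stands in for what the DB records (hypothetical input);
    `candidate_roots` stands in for the possible on-disk locations.
    """
    roots = [Path(root) for root in candidate_roots]
    missing = []
    for rel in relative_paths:
        # A file counts as present if it exists in at least one location.
        if not any((root / rel).is_file() for root in roots):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Tiny self-contained demo using throwaway directories.
    with tempfile.TemporaryDirectory() as tmp:
        location_a = Path(tmp) / "location_a"  # hypothetical location 1
        location_b = Path(tmp) / "location_b"  # hypothetical location 2
        location_a.mkdir()
        location_b.mkdir()
        (location_a / "present-in-a.xpi").write_bytes(b"")
        (location_b / "present-in-b.xpi").write_bytes(b"")

        expected = ["present-in-a.xpi", "present-in-b.xpi", "gone.xpi"]
        print(find_missing_files(expected, [location_a, location_b]))
```

Running something like this periodically against the real list of expected paths would at least narrow down when a file disappears, even if it does not explain why.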
More than a dozen files disappeared recently.
There were some Sentry errors related to CRON tasks (failing to move files). One theory to explore is that some CRON tasks overlap and cause issues.
@diox unfortunately, I don't have much bandwidth to investigate this issue. Would you be able to take a look, please?
It doesn't look like it's related to the crons.
Going through the logs I built a list of ~~23~~ 24 files that are missing, created between 2016 and 2020 with the majority created in 2019 (didn't find any from 2021, 2018 or 2017 - there is only one from 2020). Most of them were auto-approved and never changed status since, and were still marked as public. One even belongs to a public listed version. None belongs to a deleted version or deleted add-on (or force-disabled or blocked add-on, for that matter).
The few that are disabled were rejected, but also never approved in the first place - they are from 2016, from before auto-approval.
Took a fresh look at this.
A few interesting things to note:
- Looking through all Sentry events, there don't seem to be any new occurrences since last time, so this is only affecting a limited subset of files
- It has only happened once since we made unlisted submissions go through auto approval in November 2019 (and the few listed versions it happened on are the oldest).
- For the one version that it happened on in 2020, it seems we do not have the post-signing version in git. For the few others I've checked, we do, but notably we changed the way we extract in June 2019 to only run the task once the transaction has been committed.
- In at least one instance, we had an exception for a missing file in December 2021 that I can download fine today (!)
- There isn't anything interesting for the affected add-ons/versions in activity logs AFAICT
Next step will be sharing the list with ops to see what files are actually missing today. But I doubt we'll find the exact reason why those files disappeared... I suspect it has to do with transactions and write errors causing a desync, but I don't know exactly.
All files from the list I built from the logs are actually missing. They are unfortunately too old to recover from backups and also too old for backups to tell us when they disappeared.
Old Jira Ticket: https://mozilla-hub.atlassian.net/browse/ADDSRV-57