
feat: move submission files async with retry

Open jennifer-richards opened this issue 4 months ago • 2 comments

Validating and rendering the various formats of a submitted I-D will, in general, happen in a different container (i.e., a celery worker container) from the one that moves its files when it is posted as a draft (i.e., the main django container). These processes may not agree on the contents of the filesystem. That's bad news for the move_files_to_repository method, which assumes its job is done once it has moved the files it can see on the filesystem.

This is causing #8016, where NFS sync between containers sometimes takes anywhere from a few seconds to a few tens of seconds. We might be able to improve that, but in a future where files become blobs in an external store, the problem will get worse, so patching around it is not appealing.

This PR does a couple of things. First, it adds modeling to track the files associated with a submission. For now, the model records each file's name, its creation time, and whether it was generated. A file that was not generated is assumed to have been uploaded. This allows us to rely on the database to tell us which files need moving, which better guarantees data consistency.
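To illustrate the idea, here is a minimal sketch of such a per-file record, written as a plain dataclass rather than a Django model so it runs on its own. The names `SubmissionFileRecord` and `expected_filenames`, and the example filenames, are hypothetical and are not taken from the PR; only the three tracked attributes (filename, creation time, generated flag) come from the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class SubmissionFileRecord:
    """Hypothetical stand-in for the per-file tracking record."""

    filename: str
    generated: bool = False  # False means the file is assumed to be an upload
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @property
    def uploaded(self) -> bool:
        # A file that was not generated is assumed to have been uploaded.
        return not self.generated


def expected_filenames(records):
    """Return the names of all files that must be moved, per the records."""
    return sorted(r.filename for r in records)


records = [
    SubmissionFileRecord("draft-example-foo-00.xml"),
    SubmissionFileRecord("draft-example-foo-00.txt", generated=True),
]
print(expected_filenames(records))
```

The point of the sketch is that the list of files to move comes from the records, not from a directory listing, so a file that has not yet appeared on the local filesystem is still known to be expected.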

It also moves the chore of relocating files from the staging path to the draft repository into an asynchronous celery task. The task is tolerant of missing files and retries, moving what it can see, until all the expected files have been moved. With the parameters chosen, it should usually finish in 5-15 seconds but will try for up to 2-3 minutes before giving up. (The ranges are approximate because retry jitter is enabled.)
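The retry behavior can be sketched roughly as follows. This is not the PR's task: a plain loop stands in for Celery's retry scheduling, and the function names, `max_retries`, `base_delay`, and the jittered exponential backoff are all assumptions chosen for illustration, not the parameters the PR uses.

```python
import random
import shutil
import time
from pathlib import Path


def move_visible_files(filenames, staging_dir, repo_dir):
    """Move every expected file that is currently visible.

    Returns the names that are still missing from both directories.
    """
    missing = []
    for name in filenames:
        src = Path(staging_dir) / name
        dest = Path(repo_dir) / name
        if dest.exists():
            continue  # already moved on an earlier attempt
        if src.exists():
            shutil.move(str(src), str(dest))
        else:
            missing.append(name)  # not visible yet; try again later
    return missing


def move_with_retry(filenames, staging_dir, repo_dir,
                    max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry until all expected files are moved; return the retry count used.

    max_retries and base_delay are illustrative values only.
    """
    missing = list(filenames)
    for attempt in range(max_retries + 1):
        missing = move_visible_files(filenames, staging_dir, repo_dir)
        if not missing:
            return attempt
        if attempt < max_retries:
            # Exponential backoff with jitter: wait a random time up to the cap.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise FileNotFoundError(f"gave up waiting for: {missing}")
```

Because each attempt skips files already present in the destination, repeating the operation is safe: a retry only does work for files that were not visible the previous time.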

Fixes #8016

jennifer-richards, Oct 05 '24 20:10