dataverse
File Upload: Clean up files left in /tmp after successful ingest
Ingest leaves files in /tmp. This is not normally an issue but ideally we clean up after ourselves so that we don't consume space, especially in the case of large uploads of many files. Though /tmp cleaner jobs take care of this in most cases, it would make the task lighter if we clean up when we can.
For normal file ingest in v4.6.1, CSV and Excel ingest leave behind files of the form firstpass*.tab
In v4.6.2 we have added support for Swift storage. In addition to the v4.6.1 behavior for local files, Swift files leave the following:
- csv and excel ingest leaves files of the form firstpass*.tab
- large images leave files of the form tempFileToRescale*.tmp
- other ingest files (.sav, .spss, .por) leave files of the form tempIngestSourceFile*.tmp
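For anyone who wants to check a server for these leftovers, here is a minimal sketch that lists files in a temp directory matching the three name patterns above. It is not part of Dataverse; the class name `LeftoverIngestFiles` and the method `findLeftovers` are hypothetical, and it assumes the files sit directly in the temp directory rather than in subdirectories.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not Dataverse code: reports leftover ingest temp files.
public class LeftoverIngestFiles {

    // One glob covering the three file name patterns reported in this issue.
    public static final String LEFTOVER_GLOB =
            "{firstpass*.tab,tempFileToRescale*.tmp,tempIngestSourceFile*.tmp}";

    // Returns the regular files directly under tempDir whose names match the glob.
    public static List<Path> findLeftovers(Path tempDir) {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(tempDir, LEFTOVER_GLOB)) {
            for (Path candidate : stream) {
                if (Files.isRegularFile(candidate)) {
                    matches.add(candidate);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(matches); // stable, readable output order
        return matches;
    }

    public static void main(String[] args) {
        // Defaults to the JVM temp directory; pass a path to check another one.
        Path tempDir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        for (Path leftover : findLeftovers(tempDir)) {
            System.out.println(leftover);
        }
    }
}
```

The sketch only reports files. Deleting them is left to the admin, since a matching file may belong to an ingest that is still running.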
Also see related ticket on documenting general temp file use and recommendations for sys admins, #2848
As of #3767, csv ingest will no longer do this.
@kcondon as of #5089, which was merged in 4.9.3, @qqmyers suggests he has improved how we clean up files left in /tmp.
Working on behalf of TDL, I've found one case for shapefile zips where a successful upload leaves temporary files in the defined temp directory, and several ways that user actions to delete or cancel when uploading can cause temp files to remain. I've gone through the code and have changes to submit.
The only cases I'm aware of where 'persistent' temp files can still be created would be 1) where network errors break the communication with the browser and neither a save nor a cancel is ever received, and 2) some code that writes directly to subdirs under /tmp (e.g. some of the R code), which is nominally cleaned up by the operating system.
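As a general illustration of cleaning up on every exit path (this is not the code from the submitted changes), a temp file can be created and deleted inside one helper, so that both success and failure remove it. The names `TempFileCleanup`, `TempFileTask`, and `withTempFile` are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not Dataverse code: scoped temp file with guaranteed cleanup.
public class TempFileCleanup {

    // Work to perform against the temp file; may fail with an IOException.
    public interface TempFileTask<T> {
        T apply(Path tempFile) throws IOException;
    }

    // Creates a temp file, runs the task, and deletes the file whether or not
    // the task succeeds.
    public static <T> T withTempFile(String prefix, String suffix, TempFileTask<T> task)
            throws IOException {
        Path tempFile = Files.createTempFile(prefix, suffix);
        try {
            return task.apply(tempFile);
        } finally {
            Files.deleteIfExists(tempFile);
        }
    }

    public static void main(String[] args) throws IOException {
        long size = withTempFile("tempIngestSourceFile", ".tmp", tempFile -> {
            Files.write(tempFile, "a,b\n1,2\n".getBytes(StandardCharsets.UTF_8));
            return Files.size(tempFile);
        });
        System.out.println("processed " + size + " bytes; temp file removed");
    }
}
```

This pattern does not help with case 1 above, where the request never completes and no code path runs to trigger the cleanup. That case still needs an external cleaner job.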
@mheppler .xlsx files still produce a firstpass* file in /tmp, and ingesting RData files produces a data-*.tab file in /tmp
tempFileToRescale*.tmp files are now deleted as of PR #9637.
firstpass*.tab and tempIngestSourceFile*.tmp files are still left behind.
To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.
If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.