
File Upload: Clean up files left in /tmp after successful ingest

Open kcondon opened this issue 8 years ago • 4 comments

Ingest leaves files in /tmp. This is not normally an issue but ideally we clean up after ourselves so that we don't consume space, especially in the case of large uploads of many files. Though /tmp cleaner jobs take care of this in most cases, it would make the task lighter if we clean up when we can.

For normal file ingest, v4.6.1 leaves files of the form firstpass*.tab in /tmp after csv and excel ingest.

In v4.6.2 we added support for Swift storage. In addition to the v4.6.1 behavior for local files, Swift files leave the following:

- csv and excel ingest leaves files of the form firstpass*.tab
- large images leave files of the form tempFileToRescale*.tmp
- other ingest files (.sav, .spss, .por) leave files of the form tempIngestSourceFile*.tmp

Also see related ticket on documenting general temp file use and recommendations for sys admins, #2848
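For sys admins who want to see what ingest has left behind, the file name patterns reported above can be checked with a small script. This is only a sketch: the function name `find_leftover_ingest_files`, the `min_age_seconds` parameter, and the choice to match on file name and modification time are assumptions for illustration, not part of Dataverse, and the pattern list covers only the names mentioned in this issue.

```python
import fnmatch
import os
import time

# File name patterns reported in this issue; the list may not be complete.
LEFTOVER_PATTERNS = (
    "firstpass*.tab",
    "tempFileToRescale*.tmp",
    "tempIngestSourceFile*.tmp",
)


def find_leftover_ingest_files(tmp_dir, min_age_seconds=0.0, now=None):
    """Return sorted paths of regular files directly under tmp_dir whose
    names match a known pattern and whose mtime is at least
    min_age_seconds in the past. Hypothetical helper, not Dataverse code."""
    now = time.time() if now is None else now
    matches = []
    with os.scandir(tmp_dir) as entries:
        for entry in entries:
            # Skip directories and symlinks; only plain files are candidates.
            if not entry.is_file(follow_symlinks=False):
                continue
            if not any(fnmatch.fnmatchcase(entry.name, p) for p in LEFTOVER_PATTERNS):
                continue
            age = now - entry.stat(follow_symlinks=False).st_mtime
            if age >= min_age_seconds:
                matches.append(entry.path)
    return sorted(matches)


if __name__ == "__main__":
    # Report (do not delete) matching files older than one day in /tmp.
    for path in find_leftover_ingest_files("/tmp", min_age_seconds=24 * 3600):
        print(path)
```

The script only reports matches. An age threshold is used so that files belonging to an ingest that is still running are not flagged; the one-day value is an arbitrary example.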

kcondon avatar May 03 '17 19:05 kcondon

as of #3767, csv ingest will no longer do this.

oscardssmith avatar Jul 07 '17 18:07 oscardssmith

@kcondon as of #5089, which was merged in 4.9.3, @qqmyers suggests he has improved how we clean up files left in /tmp.

Working on behalf of TDL, I've found one case for shapefile zips where a successful upload leaves temporary files in the defined temp directory, and several ways that user actions to delete or cancel when uploading can cause temp files to remain. I've gone through the code and have changes to submit.

The only cases I'm aware of where 'persistent' temp files can still be created would be 1) where network errors break the communication with the browser and neither a save nor a cancel is ever received, and 2) some code that writes directly to subdirs under /tmp (e.g. some of the R code), which is nominally cleaned up by the operating system.
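The general idea of removing a temp file on every exit path (save, cancel, or error) can be sketched as follows. This is an illustration in Python, not the Dataverse code (which is Java) and not the change submitted in #5089; the name `ingest_temp_file` and its parameters are hypothetical.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def ingest_temp_file(prefix="tempIngestSourceFile", suffix=".tmp", dir=None):
    """Create a temp file and remove it when the block exits, whether the
    block finishes normally or raises. Hypothetical helper for illustration."""
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=suffix, dir=dir)
    os.close(fd)  # callers reopen the file by path as needed
    try:
        yield path
    finally:
        # The file may already have been moved or deleted by the caller.
        with contextlib.suppress(FileNotFoundError):
            os.remove(path)


if __name__ == "__main__":
    with ingest_temp_file() as path:
        with open(path, "w") as handle:
            handle.write("uploaded bytes would go here")
    print(os.path.exists(path))
```

A pattern like this only helps when the code that created the file gets a chance to run its cleanup. It does not cover case 1 above, where neither a save nor a cancel ever arrives, so some age-based sweep of the temp directory would still be needed there.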

mheppler avatar Jun 28 '19 18:06 mheppler

@mheppler .xlsx files still produce a firstpass* file in /tmp, and ingesting RData files produces a data-*.tab file in /tmp.

kcondon avatar Jul 16 '19 18:07 kcondon

tempFileToRescale*.tmp files are now deleted as of PR #9637.

firstpass*.tab and tempIngestSourceFile*.tmp files are still left behind.

haarli avatar Mar 28 '24 14:03 haarli

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

cmbz avatar Aug 20 '24 15:08 cmbz