JobTimeoutException Leaves Temp Files
Hitting rq's JobTimeoutException while Xloader is loading a resource seems to leave temp files behind in the tmp directory.
I'm not even sure this can be fixed in Xloader itself. I think you need to use rq's push_exc_handler inside the ckan.cli.jobs.worker method, so I'm doing some debugging to see if I can add something to CKAN core for registering exception handlers on the jobs worker.
And then it would be a matter of figuring out how to get the temp file path/name into the implemented exception handler in Xloader here.
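For reference, a rough sketch of what such a handler could look like. rq calls exception handlers with `(job, exc_type, exc_value, traceback)`. Everything else here is an assumption for illustration: the `tmp_path` key in `job.meta` is hypothetical (the job would have to store it there itself), and the local `JobTimeoutException` class is only a stand-in for `rq.timeouts.JobTimeoutException` so the snippet runs without rq installed.

```python
import os


class JobTimeoutException(Exception):
    """Stand-in for rq.timeouts.JobTimeoutException (assumption, so this runs without rq)."""


def remove_tmp_on_timeout(job, exc_type, exc_value, traceback):
    """Exception handler using rq's (job, exc_type, exc_value, traceback) signature."""
    if exc_type is not None and issubclass(exc_type, JobTimeoutException):
        # Hypothetical: the job stored its temp file path under job.meta["tmp_path"].
        tmp_path = job.meta.get("tmp_path")
        if tmp_path and os.path.exists(tmp_path):
            os.remove(tmp_path)
    # Returning True lets rq fall through to the next handler on the stack.
    return True


# Registration inside the worker setup would look roughly like:
#     worker.push_exc_handler(remove_tmp_on_timeout)
```

The open question from above still applies: the handler can only clean up if the job has some way to tell it where the temp file lives.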
Sorry for the sporadic comments on this one. After more debugging, I found that we can catch JobTimeoutException while the file is being downloaded into the temp file, and clean it up there.
The issue I still have is with the step where the temp file's contents are copied into the database in the loader.py script. If JobTimeoutException is raised at that point, the temp file remains.
Currently I am debugging all of this with load_table and not load_csv.
Okay yeah we can fix this in Xloader. (e.g. https://github.com/open-data/ckanext-xloader/commit/62ed5a0626a1d402c2a7340304580c5210647853#diff-69e6ff3cab84fe327b715b7b1d65f7cb9660b09e076ec36dbdb5e12ffeebe3f6)
Will make a PR next week if I have time. But we need to catch the rq timeout exception in a couple places and then just close the tmp file.
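Roughly, the pattern I mean is the one below. This is only a sketch, not the code from the commit or the PR: `run_with_tmp_cleanup` is a hypothetical helper, the local `JobTimeoutException` is a stand-in for `rq.timeouts.JobTimeoutException`, and it assumes the temp file is a `tempfile.NamedTemporaryFile` created with the default `delete=True`, so that closing it also removes it from disk.

```python
class JobTimeoutException(Exception):
    """Stand-in for rq.timeouts.JobTimeoutException (assumption, so this runs without rq)."""


def run_with_tmp_cleanup(tmp_file, step):
    """Run step(tmp_file); if the job times out, close the temp file and re-raise."""
    try:
        return step(tmp_file)
    except JobTimeoutException:
        # Closing a NamedTemporaryFile opened with delete=True also deletes it.
        tmp_file.close()
        # Re-raise so rq still records the job as timed out.
        raise
```

The same wrapper would apply to both places mentioned above (the download and the load into the database), with `step` being whichever function does the work.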
Let's see if it is fixed with this: https://github.com/ckan/ckanext-xloader/pull/223
I have something similar in our fork and it is working on our staging branch. Generally I have only seen this issue with super large resources.
Squashed version: https://github.com/ckan/ckanext-xloader/pull/239 (also added to the changelog file)