nlp-recipes icon indicating copy to clipboard operation
nlp-recipes copied to clipboard

[ASK] [tc_multi_languages_transformers.ipynb] temporary data directory and cache directory are not deleted after the notebook run

Open daden-ms opened this issue 5 years ago • 9 comments

Description

The temporary directories should be deleted after the notebook run is finished.

Other Comments

consider https://security.openstack.org/guidelines/dg_using-temporary-files-securely.html

daden-ms avatar Nov 25 '19 21:11 daden-ms

@hlums @daden-ms is this related to https://github.com/microsoft/nlp-recipes/issues/487?

miguelgfierro avatar Nov 26 '19 11:11 miguelgfierro

yes, but adding the temp directory where the dataset is downloaded to.

daden-ms avatar Nov 26 '19 16:11 daden-ms

I thought #487 is about fixing the temp directory usage in the tests. This is a notebook and it's already using a TemporaryDirectory for both data loading and SequenceClassifier initialization. So it's strange the directory wasn't removed after notebook run. @kehuangms can you take a look since you are the author of the notebook?

hlums avatar Nov 26 '19 22:11 hlums

TemporaryDirectory will not be deleted if it's not used with a context manager or if it's not deleted explicitly

daden-ms avatar Nov 27 '19 02:11 daden-ms

So it's strange the directory wasn't removed after notebook run

This line https://github.com/microsoft/nlp-recipes/blob/master/tests/conftest.py#L104 should erase the directory after each test run, is that not happening?

miguelgfierro avatar Nov 27 '19 09:11 miguelgfierro

@miguelgfierro This is not related to the tests. This is just about running the notebook locally. I think Daisy meant the TemporaryDirectory needs to be explicitly deleted at the end of the notebook.

hlums avatar Nov 27 '19 14:11 hlums

The notebook uses tempfile package and uses the TemporaryDirectory high level interface to create the temp folder. According to the document, it "provide automatic cleanup and can be used as context managers" so no need to explicitly delete the directory. Of course, the directory can be explicitly cleaned up by calling the cleanup() method. But I would prefer the system to clean it since that's the whole idea of using the temp directory.

Details please refer to the documentation of the TemporaryDirectory interface.

If no one has objection, I would vote to close this as it's not an issue.

kehuangms avatar Dec 05 '19 16:12 kehuangms

@kehuangms Did you verify the cleanup happened as expected? I think the reason @daden-ms created the issues is the cleanup didn't happen.

hlums avatar Dec 05 '19 16:12 hlums

if I'm not mistaken the folder is deleted after 3 days. @loomlike worked on this so he might have more info

miguelgfierro avatar Dec 05 '19 17:12 miguelgfierro