`metaflow.S3` fails to remove its temp dir upon raising `MetaflowS3NotFound` on an NFS-mounted filesystem
When `metaflow.S3` is used as a context manager and raises `MetaflowS3NotFound`, its `close` method attempts to remove the temporary directory.
When this happens on an NFS-mounted filesystem, a `.nfs*` file is present within the temp directory, presumably because some process still holds an open file handle on something in the directory. Consequently, the `rmtree` call raises an exception that looks something like this:
OSError: [Errno 16] Device or resource busy: '.nfs89219eebf7d2eef900000027'
`close` ignores this exception, so the temp directory is not removed as intended.
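As a rough illustration of how cleanup could report this failure instead of losing it, here is a minimal sketch. `remove_tmpdir` is a hypothetical helper, not part of Metaflow, and the retry count and delay are arbitrary assumptions. It has not been verified on NFS: the `.nfs*` file persists until the open handle is closed, so retrying only helps if that happens shortly afterwards.

```python
import errno
import shutil
import time


def remove_tmpdir(path, retries=3, delay=0.1):
    """Try to remove `path`; return None on success, else the last OSError.

    Hypothetical helper for illustration only (not a Metaflow API).
    """
    last_error = None
    for _ in range(retries):
        try:
            shutil.rmtree(path)
            return None
        except FileNotFoundError:
            # The directory is already gone, which is the desired end state.
            return None
        except OSError as exc:
            last_error = exc
            # Only EBUSY/ENOTEMPTY (the errors expected when a .nfs* file
            # lingers) are treated as worth retrying; anything else is
            # returned immediately.
            if exc.errno not in (errno.EBUSY, errno.ENOTEMPTY):
                break
            time.sleep(delay)
    return last_error
```

A caller could log the returned error, or raise it, rather than discarding it, which would at least make the leftover `metaflow.s3.*` directory visible.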
The following script illustrates the issue, but only when run on an NFS-mounted filesystem:
#!/usr/bin/env python
import glob
import shutil

import boto3
from metaflow import S3
from metaflow.datatools.s3 import MetaflowS3NotFound
from moto import mock_s3


def main():
    print(f"Before: {glob.glob('metaflow.s3.*')=}")
    with mock_s3():
        s3_client = boto3.client("s3")
        s3_client.create_bucket(
            Bucket="bucket",
            CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
        )
        try:
            with S3(s3root="s3://bucket/prefix") as s3:
                s3.get("key")
        except MetaflowS3NotFound:
            print(f"After: {glob.glob('metaflow.s3.*')=}")
            # Elicit the exception that is ignored in `metaflow.S3.close`.
            shutil.rmtree(s3._tmpdir)


if __name__ == "__main__":
    main()