filesystem_spec icon indicating copy to clipboard operation
filesystem_spec copied to clipboard

AbstractBufferedFile does not appear to close an mmap cache

Open andreasg123 opened this issue 3 years ago • 3 comments

The result of mmap.mmap in MMapCache does not appear to get closed. That may be the reason for this error when repeatedly opening files stored on S3:

  File ".../python3.8/site-packages/fsspec/spec.py", line 1006, in open
    f = self._open(
  File ".../python3.8/site-packages/s3fs/core.py", line 489, in _open
    return S3File(
  File ".../python3.8/site-packages/s3fs/core.py", line 1659, in __init__
    super().__init__(
  File ".../python3.8/site-packages/fsspec/spec.py", line 1341, in __init__
    self.cache = caches[cache_type](
  File ".../python3.8/site-packages/fsspec/caching.py", line 58, in __init__
    self.cache = self._makefile()
  File ".../python3.8/site-packages/fsspec/caching.py", line 80, in _makefile
    return mmap.mmap(fd.fileno(), self.size)
OSError: [Errno 12] Cannot allocate memory

I believe that something like this in AbstractBufferedFile.close would fix it:

if self.mode == "rb":
    if isinstance(self.cache, MMapCache) and self.cache.cache:
        self.cache.cache.close()
    self.cache = None

A better solution may be to give all cache classes a close method that would be called at the indicated place.

andreasg123 avatar Dec 10 '21 20:12 andreasg123

I would have thought that the cache is closed when the references are cleaned up. Can you have a look to see if the reference is being held somewhere? Perhaps the right thing to do would be to add a weakref.finalize to the file fsspec cache instance.

I would be mostly against adding cache-specific logic to spec.py, if it can be helped.

martindurant avatar Dec 10 '21 20:12 martindurant

I just verified with a test program that an mmap should disappear when the reference is gone. I'll see if I can figure out where a reference may be kept.

If you want to make a change, here is a general approach:

class BaseCache(object):
    def close(self):
        pass

class MMapCache(BaseCache):
    def close(self):
        if self.cache:
            self.cache.close()
            self.cache = None

if self.mode == "rb":
    if self.cache:
        self.cache.close()
        self.cache = None

andreasg123 avatar Dec 10 '21 21:12 andreasg123

That does look fairly harmless, but let me know what your investigation turns up

martindurant avatar Dec 10 '21 21:12 martindurant