telluric
memory leak in raster_opener?
This is a followup from #262
The implementation of save_cloud_optimized contained the following block:
with self._raster_opener(self.source_file) as r:
    nodata = r.nodata
src.save(tf.name, overviews=False)
convert_to_cog(tf.name, dest_url, resampling, blocksize, overview_blocksize, creation_options)
It appears that this leaks memory.
It was suggested there that we should make _raster_opener into a proper context manager and take care of properly releasing resources. Moving discussion here, because #262 can be solved independently, by other means.
Continuing the discussion from #262:
Actually, I now see that my comment there was mistaken: it appears that _raster_opener is NOT a context manager, but just a function that returns a rasterio file object.
So, if rasterio file objects provide a context manager, everything should already be closed when the context is exited. If the block quoted in this issue's description still leaks, then either:
(a) r.nodata is actually not an int (as I would expect), but some object that keeps a reference to internal objects, so they do not get deleted, or
(b) there is some bug in rasterio that causes the context exit code not to free some resource.
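One way to start telling (a) and (b) apart might be a small check like the one below. This is only a sketch: `FakeDataset` is a hypothetical stand-in so the snippet runs without rasterio, `check_release` is a name made up for illustration, and the assumption that the real object exposes `nodata` and a `closed` flag should be verified against the rasterio version in use.

```python
class FakeDataset:
    """Hypothetical stand-in for the object returned by _raster_opener."""

    def __init__(self):
        self.nodata = 0.0
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Mimic a dataset that closes itself when the context exits.
        self.closed = True
        return False


def check_release(open_dataset):
    """Report the type of nodata and whether the dataset was closed on exit."""
    with open_dataset() as r:
        nodata = r.nodata
    return {
        # A plain number (or None) cannot keep dataset internals alive.
        "nodata_type": type(nodata).__name__,
        # False here would point towards a problem in the context exit code.
        "closed_after_exit": r.closed,
    }


report = check_release(FakeDataset)
print(report)
```

If `nodata_type` turns out to be a plain number or `NoneType`, hypothesis (a) looks unlikely. Note that this check says nothing about memory held by native libraries underneath, so a clean report does not rule out (b).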
This needs further investigation. If it turns out that this is something we cannot push upstream to fix, we may want to wrap the opener in a context manager, as suggested in the discussion.
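For reference, wrapping an opener in a context manager could look roughly like this. It is a minimal sketch, not telluric's actual code: `managed_opener` and `FakeDataset` are hypothetical names, and the only assumption about the opened object is that it has a `close()` method.

```python
from contextlib import contextmanager


@contextmanager
def managed_opener(opener, *args, **kwargs):
    """Open a resource via `opener` and always close it when the block ends."""
    resource = opener(*args, **kwargs)
    try:
        yield resource
    finally:
        # Runs on normal exit and when the body raises.
        resource.close()


class FakeDataset:
    """Hypothetical stand-in for a raster file object, for demonstration only."""

    def __init__(self, path):
        self.path = path
        self.nodata = 0
        self.closed = False

    def close(self):
        self.closed = True


with managed_opener(FakeDataset, "example.tif") as r:
    nodata = r.nodata
```

The `try`/`finally` is the important part: the resource is released even if the body of the `with` block fails, which a bare function returning a file object cannot guarantee on its own.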
About the memory leak: I wouldn't rush to blame _raster_opener, because the cause might be self.source_file, which in some cases creates an in-memory raster.