weather-tools
`weather-mv`: Memory leak when opening multiple datasets in memory using `open_dataset()`
Code:
...
if __name__ == '__main__':
    print(f"Before execution start: {psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2} MiB")
    main(arr)
    print(f"After everything is done: {psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2} MiB")
Output:
python -m memory_profiler test_ds_close.py
Before execution start: 269.89453125 MiB
Converting 'gs://XXXX/A1DXXX.bz2' to COGs...
Converting 'gs://XXXX/A2DXXX.bz2' to COGs...
Assuming grib.
Normalizing the grib schema, name of the data variables will look like '<level>_<height>_<attrs['GRIB_stepType']>_<key>'.
opened dataset size: 570909640
Assuming grib.
Normalizing the grib schema, name of the data variables will look like '<level>_<height>_<attrs['GRIB_stepType']>_<key>'.
opened dataset size: 570909640
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   283    270.7 MiB    270.7 MiB            1   @profile
   284                                          def main(input):
   285
   286    270.7 MiB      0.0 MiB            1       asset_name, uris = input
   287    270.7 MiB      0.0 MiB            1       uris.sort()
   288
   289    270.7 MiB      0.0 MiB            1       if len(uris) == 1 and asset_name[1] == '-':
   290                                                  asset_name = uris[0]
   291
   292    270.7 MiB      0.0 MiB            1       print(f'Converting {uris[0]!r} to COGs...')
   293    270.7 MiB      0.0 MiB            1       print(f'Converting {uris[1]!r} to COGs...')
   294   1568.2 MiB   1297.5 MiB            1       with open_dataset(uris[0]) as ds1, open_dataset(uris[1]) as ds2:
   295   1525.8 MiB    -42.4 MiB            1           pass
   296
   297   1525.8 MiB      0.0 MiB            1       ds2.close()
   298   1525.8 MiB      0.0 MiB            1       ds1.close()
   299    928.7 MiB   -597.2 MiB            1       del ds2
   300    928.7 MiB      0.0 MiB            1       del ds1
After everything is done: 928.65234375 MiB
Library/Tool Versions:
Python: 3.8.13
rasterio: 1.3.0
GDAL: 3.5.1
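Please observe that only the memory for the 2nd dataset is released: the profiler shows a 597.2 MiB drop at `del ds2` and no change at `del ds1`, so the process ends at 928.7 MiB compared with 270.7 MiB when `main` started.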
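To narrow this down, it may help to measure RSS around each step separately, with an explicit `gc.collect()` before the second reading. That way memory still waiting for garbage collection is not counted as leaked. The following is only a sketch: `rss_mib`, `track_rss`, and `allocate_and_release` are hypothetical helper names, and the `bytearray` workload is a stand-in for the real `open_dataset()` call. It uses `psutil` when installed and falls back to `/proc/self/statm`, which assumes Linux.

```python
import gc
import os
from contextlib import contextmanager

try:
    import psutil  # third-party; the same library the report uses
except ImportError:
    psutil = None


def rss_mib():
    """Return the current resident set size of this process in MiB."""
    if psutil is not None:
        return psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2
    # Fallback (assumes Linux): field 2 of /proc/self/statm is resident pages.
    with open('/proc/self/statm') as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf('SC_PAGE_SIZE') / 1024 ** 2


@contextmanager
def track_rss(label, log):
    """Append (label, rss_before, rss_after) to `log` for the wrapped block."""
    before = rss_mib()
    try:
        yield
    finally:
        # Force a collection so unreachable objects are not counted as retained.
        gc.collect()
        log.append((label, before, rss_mib()))


def allocate_and_release():
    """Stand-in workload: allocate about 50 MiB, touch it, then drop it."""
    buf = bytearray(50 * 1024 * 1024)
    buf[0] = 1
    del buf


if __name__ == '__main__':
    log = []
    with track_rss('stand-in workload', log):
        allocate_and_release()
    for label, before, after in log:
        print(f'{label}: {before:.1f} MiB -> {after:.1f} MiB ({after - before:+.1f} MiB)')
```

Wrapping each `open_dataset()` call in its own `track_rss` block, instead of opening both in a single `with` statement, should show how much each dataset contributes and how much remains after it is closed and deleted.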