
`weather-mv`: Memory leak while opening multiple datasets in-memory using `open_dataset()`

Open mahrsee1997 opened this issue 2 years ago • 0 comments

Code:

...
if __name__ == '__main__':
    print(f"Before execution start: {psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2} MiB")
    main(arr)
    print(f"After everything is done: {psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2} MiB")

Output:

python -m memory_profiler test_ds_close.py
Before execution start: 269.89453125 MiB
Converting 'gs://XXXX/A1DXXX.bz2' to COGs...
Converting 'gs://XXXX/A2DXXX.bz2' to COGs...
Assuming grib.
Normalizing the grib schema, name of the data variables will look like '<level>_<height>_<attrs['GRIB_stepType']>_<key>'.
opened dataset size: 570909640
Assuming grib.
Normalizing the grib schema, name of the data variables will look like '<level>_<height>_<attrs['GRIB_stepType']>_<key>'.
opened dataset size: 570909640
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   283    270.7 MiB    270.7 MiB           1   @profile
   284                                         def main(input):
   285                                             
   286    270.7 MiB      0.0 MiB           1       asset_name, uris = input
   287    270.7 MiB      0.0 MiB           1       uris.sort()
   288                                         
   289    270.7 MiB      0.0 MiB           1       if len(uris) == 1 and asset_name[1] == '-':
   290                                                 asset_name = uris[0]
   291                                         
   292    270.7 MiB      0.0 MiB           1       print(f'Converting {uris[0]!r} to COGs...')
   293    270.7 MiB      0.0 MiB           1       print(f'Converting {uris[1]!r} to COGs...')
   294   1568.2 MiB   1297.5 MiB           1       with open_dataset(uris[0]) as ds1, open_dataset(uris[1]) as ds2:
   295   1525.8 MiB    -42.4 MiB           1           pass
   296                                         
   297   1525.8 MiB      0.0 MiB           1       ds2.close()
   298   1525.8 MiB      0.0 MiB           1       ds1.close()
   299    928.7 MiB   -597.2 MiB           1       del ds2
   300    928.7 MiB      0.0 MiB           1       del ds1


After everything is done: 928.65234375 MiB

Library/Tool Version:

Python - Version: 3.8.13
rasterio - Version: 1.3.0
GDAL - Version: 3.5.1

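Please observe that memory is released only for the 2nd dataset (597.2 MiB at `del ds2`); `del ds1` frees nothing, and the process ends at 928.65 MiB versus 269.89 MiB at start.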
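
A possible way to narrow this down, sketched under assumptions: check whether the Python objects themselves are freed after `close()` and `del`. If the weak references go dead but RSS stays high, the memory is probably held outside Python's object graph (for example by a native library or the allocator) rather than by a lingering reference. `FakeDataset`, `is_collected` and `check` are hypothetical names for illustration; `FakeDataset` only stands in for whatever `open_dataset()` returns and is not the weather-mv implementation.

```python
import gc
import weakref


class FakeDataset:
    """Hypothetical stand-in for the object returned by open_dataset()."""

    def __init__(self, nbytes):
        self.buffer = bytearray(nbytes)  # simulates data loaded in memory

    def close(self):
        self.buffer = None  # drop the payload, as a real close() might

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()


def is_collected(ref):
    """Return True if the weakly referenced object has been freed."""
    gc.collect()  # also break any reference cycles before checking
    return ref() is None


def check():
    with FakeDataset(1024) as ds1, FakeDataset(1024) as ds2:
        r1, r2 = weakref.ref(ds1), weakref.ref(ds2)
    del ds2
    ds1_alive_after_first_del = not is_collected(r1)
    del ds1
    return ds1_alive_after_first_del, is_collected(r1), is_collected(r2)
```

In the real script the same weak-reference check could be applied to `ds1` and `ds2` right after the `del` statements, next to the existing `psutil` RSS prints, assuming the dataset objects support weak references.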

mahrsee1997, Oct 11 '22 12:10