
netcdf save: thread lock bottleneck

bblay opened this issue 7 months ago

Saving lazy-loaded cubes to netCDF causes a bottleneck in thread lock acquisition.

  • Workaround: force the cubes to load their data before saving (see the sketch after this list).
  • Saving with zlib makes it worse.
  • The source file format seems to make a big difference - grib is almost 25x slower.
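
Here's a minimal sketch of that workaround, assuming cubes loaded lazily by Iris and placeholder file paths; accessing cube.data realises the array in memory before the save:

import iris

cubes = iris.load('input.pp')  # placeholder input path

# Realise the lazy data up front (the workaround described above).
for cube in cubes:
    cube.data  # accessing .data loads the array into memory

iris.save(cubes, 'output.nc')  # placeholder output path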

Here's a summary from a simple performance test, which loads cubes from a variety of file formats and saves them to netCDF. Results are shown for each file type as forced loading and zlib are toggled on and off.

Results summary (times in seconds):

force_load  zlib   1x ff (400M)  1x nc (1.8G)  1x pp (680M)  62x grib (50M)
True        False     16.232053     15.842497     12.208700       66.493797
True        True      23.958872     35.287851     44.178169       75.444709
False       False     47.509229     18.348945     24.172450      821.598061
False       True      98.374770     33.127275    185.618109     1559.963509

Note: There are 62 grib files with a much smaller total data size.

Here are the profile stats, with and without forced loading:


Without forced loading: no_force__no_zlib

With forced loading: force__no_zlib
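
The dumped .pstat files can be inspected with the standard-library pstats module, for example (the filename matches one of the dumps written by the script below):

import pstats

# Load one of the dumped profiles and list the 20 most expensive
# calls by cumulative time.
stats = pstats.Stats('no_force__no_zlib.pstat')
stats.sort_stats('cumulative').print_stats(20)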

Here's the code - couldn't make it collapsible, sorry!

import cProfile
import warnings
from time import perf_counter

import iris


files = {
    '1x ff (400M)': 'OMITTED',
    '1x nc (1.8G)': 'OMITTED.nc',
    '1x pp (680M)': 'OMITTED.pp',
    '62x grib (50M)': 'OMITTED/*.GRIB',
}


class Timer(object):
    """
    A simple timing context manager.
    """
    def __init__(self):
        self.start = None
        self.taken = None

    def __enter__(self):
        self.start = perf_counter()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        assert self.start is not None
        self.taken = perf_counter() - self.start


def performance_test(files, force_load, zlib):
    results = {}
    for label, file in files.items():
        results[label] = process_file(file, force_load=force_load, zlib=zlib)

    print(f'\nforce_load {force_load}, zlib {zlib}')
    for name, time in sorted(results.items()):
        print(f'{name} {time:02f}')


def process_file(file, force_load, zlib):
    with Timer() as timer:
        cubes = iris.load(file)
        if force_load:
            for cube in cubes:
                # Touching cube.data realises the lazy array in memory.
                assert cube.data.shape
        iris.save(cubes, '/net/home/h02/bblay/lfric_agfgregate/out.nc', zlib=zlib)

    return timer.taken


if __name__ == '__main__':
    warnings.filterwarnings(action='ignore', message='Ignoring a datum in netCDF load')

    for force_load in [True, False]:
        for zlib in [False, True]:
            with cProfile.Profile() as profiler:
                performance_test(files=files, force_load=force_load, zlib=zlib)

                fname = 'force' if force_load else 'no_force'
                fname += '__zlib' if zlib else '__no_zlib'
                profiler.dump_stats(f'/net/home/h02/bblay/lfric_agfgregate/{fname}.pstat')

bblay · Jul 09 '24 11:07