Concurrent read segfault

Open ArnaudLevaufre opened this issue 7 years ago • 13 comments

Hello

Opening a netCDF file in read mode more than once in a ThreadPool results in a segfault. With a process pool there is no issue and concurrent reads work fine. This can be a problem in a web environment where the server uses threads to handle its clients and two or more clients make requests that need to read the same netCDF file.

I have a simple python script and a dataset that reproduces the segfault: netcdfSegfault.zip

The python script provided in the zip:

import netCDF4
from multiprocessing.pool import ThreadPool

def read_netcdf(path):
    ieast = 1750
    iwest = 1760
    inorth = 1380
    isouth = 1370

    with netCDF4.Dataset(path, 'r') as ncf:
        return ncf['/Depth/ndepths'][:][iwest:ieast, isouth:inorth]


if __name__ == "__main__":
    print("Netcdf4 version: %s" % netCDF4.__version__)
    path = "./max_depth_ndepth_quonops.nc"
    with ThreadPool(2) as p:
        print(p.map(read_netcdf, [path for i in range(2)]))

It outputs the following when executed:

Netcdf4 version: 1.4.0
Segmentation fault (core dumped)

ArnaudLevaufre avatar Sep 03 '18 10:09 ArnaudLevaufre

This script gives me

Netcdf4 version: 1.4.2
Traceback (most recent call last):
  File "segfault.py", line 17, in <module>
    with ThreadPool(2) as p:
AttributeError: __exit__

with python 2.7.

jswhit avatar Sep 03 '18 12:09 jswhit

May be related to https://github.com/Unidata/netcdf4-python/issues/640

jswhit avatar Sep 03 '18 12:09 jswhit

Works with python3.6 if you replace ncf['/Depth/ndepths'][:][iwest:ieast, isouth:inorth] with ncf['/Depth/ndepths'][iwest:ieast, isouth:inorth]. I think it's a memory issue, not a concurrency issue.

jswhit avatar Sep 03 '18 12:09 jswhit
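To make the difference between the two indexing forms above concrete, here is a minimal sketch (not code posted in this thread). The helper names `read_then_slice`, `slice_on_read`, and `read_window` are hypothetical. The first form asks netCDF4 for the whole variable and then slices the resulting array in memory; the second passes the window to netCDF4 so that only that window is requested from the file.

```python
def read_then_slice(var, rows, cols):
    """Original form: var[:] loads the entire variable, then slices in memory."""
    return var[:][rows, cols]


def slice_on_read(var, rows, cols):
    """Suggested form: only the requested window is read from the file."""
    return var[rows, cols]


def read_window(path, variable, rows, cols):
    """Open path read-only and return variable[rows, cols] (hypothetical helper)."""
    import netCDF4  # imported here so the helpers above work without netCDF4

    with netCDF4.Dataset(path, "r") as ncf:
        return slice_on_read(ncf[variable], rows, cols)


# Example call, assuming the file from the report exists:
# read_window("./max_depth_ndepth_quonops.nc", "/Depth/ndepths",
#             slice(1750, 1760), slice(1370, 1380))
```

Reading less data reduces memory pressure, but it does not make the underlying libraries thread safe.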

You need to use a lock when using netCDF4 with multiple threads. Unfortunately, the underlying HDF5 library is not thread safe.

shoyer avatar Sep 03 '18 16:09 shoyer

Yeah, my rule from experience is that if I'm using multiple threads, all access to netcdf4-python needs to be guarded by a lock.

dopplershift avatar Sep 03 '18 21:09 dopplershift
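A minimal sketch of the locking advice above, not code posted by anyone in this thread. The names `NETCDF_LOCK`, `serialized`, and `read_window` are hypothetical. It assumes a single process-wide lock is acceptable, which means netCDF reads run one at a time even when many threads are waiting.

```python
import functools
import threading

# One lock shared by every thread in the process that touches netCDF4.
NETCDF_LOCK = threading.Lock()


def serialized(func):
    """Decorator: run func only while holding NETCDF_LOCK."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with NETCDF_LOCK:
            return func(*args, **kwargs)
    return wrapper


@serialized
def read_window(path, variable, rows, cols):
    """Open, read, and close the file entirely inside the lock."""
    import netCDF4  # imported here so the decorator is usable without netCDF4

    with netCDF4.Dataset(path, "r") as ncf:
        # The data is read here, so the returned array is safe to use
        # after the lock is released and the file is closed.
        return ncf[variable][rows, cols]


# Example call from a thread pool, assuming the file from the report exists:
# from multiprocessing.pool import ThreadPool
# args = ("./max_depth_ndepth_quonops.nc", "/Depth/ndepths",
#         slice(1750, 1760), slice(1370, 1380))
# with ThreadPool(2) as pool:
#     results = pool.starmap(read_window, [args, args])
```

Every code path that uses netCDF4 has to take the same lock, including opening and closing datasets; a lock around only the slicing step would leave the open and close calls unguarded.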

You can build HDF5 thread-safe, but it's not the default (see https://portal.hdfgroup.org/display/knowledge/Questions+about+thread-safety+and+concurrent+access). You can read HDF5 files concurrently from multiple processes, though, so using Pool instead of ThreadPool should be safe.

jswhit avatar Sep 03 '18 22:09 jswhit
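A sketch of the process-based alternative mentioned above, under the assumption that each worker process opens its own handle to the file. The names `read_window`, `make_jobs`, and `read_all` are hypothetical and not from the thread. Jobs are plain tuples so that they can be pickled and sent to the workers.

```python
from multiprocessing import Pool


def read_window(job):
    """Worker: open the file in this process, read one window, close it."""
    import netCDF4  # imported in the worker process

    path, variable, rows, cols = job
    with netCDF4.Dataset(path, "r") as ncf:
        return ncf[variable][rows, cols]


def make_jobs(path, variable, windows):
    """Build one picklable job tuple per (rows, cols) window."""
    return [(path, variable, rows, cols) for rows, cols in windows]


def read_all(jobs, workers=2):
    """Read every job in a pool of worker processes instead of threads."""
    with Pool(workers) as pool:
        return pool.map(read_window, jobs)


# Example, assuming the file from the report exists. Call read_all from
# under an `if __name__ == "__main__":` guard so worker start-up does not
# re-run it:
# window = (slice(1750, 1760), slice(1370, 1380))
# jobs = make_jobs("./max_depth_ndepth_quonops.nc", "/Depth/ndepths",
#                  [window, window])
# results = read_all(jobs)
```

The trade-off is that results are pickled and copied back to the parent process, so this suits small windows better than whole variables.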

Even if HDF5 is compiled with thread safety, the netcdf4 C library is not thread safe.

dopplershift avatar Sep 04 '18 06:09 dopplershift

Is that true even for reading/writing independent netCDF3 files?

shoyer avatar Sep 04 '18 06:09 shoyer

Thanks for the answers. I didn't know the HDF5 library was not thread safe (I must admit I didn't look for this information and took it for granted). @jswhit removing the middle [:] works on the example but not reliably with the real implementation in the context of a web server. So for now I will stick to using processes for concurrent reads.

ArnaudLevaufre avatar Sep 04 '18 09:09 ArnaudLevaufre

There was at least a plan to make the netcdf library thread safe (see https://www.unidata.ucar.edu/blogs/developer/entry/implementing-thread-safe-access-to). @DennisHeimbigner - was that ever implemented?

also https://github.com/Unidata/netcdf-c/projects/6

jswhit avatar Sep 04 '18 18:09 jswhit

Not yet. It is waiting on some other library changes. The priority is high, however.

DennisHeimbigner avatar Sep 04 '18 18:09 DennisHeimbigner

Hi everyone. @ArnaudLevaufre, have you experienced similar errors with concurrent reads even when using processes? I opened an issue in xarray (issue) since I am having serious problems reading netCDF files with several processes. Looking for some help here too... Thanks

lanougue avatar Oct 19 '18 17:10 lanougue

Hi. I had no issues with concurrent reads when using processes so I can't help you.

ArnaudLevaufre avatar Oct 19 '18 17:10 ArnaudLevaufre