Matthew Lennie

Results: 9 comments of Matthew Lennie

> Minimal repro:
>
> ```
> import xarray as xr
> ds = xr.open_mfdataset('gfs.0p25.201511*00.f0*.grib2', engine='cfgrib', combine='nested', concat_dim=['step'], parallel=True, chunks=24, backend_kwargs={'filter_by_keys': {'typeOfLevel': 'surface'}, 'indexpath': ''})
> ```
>
> Expected...

That's interesting. Do you happen to have a theory as to why this error would appear in parallel but not in serial? On 06.10.2020 09:09, Guido Cioni wrote: > I can...

Could it be that eccodes isn't thread-safe somehow? It seems that when I manually open multiple files using cfgrib.open_datasets via multiple processes, I don't get the error. I do...
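
A minimal sketch of that process-based approach could look like the following; the file names are hypothetical, and the `backend_kwargs` usage mirrors the snippet quoted further down rather than any exact code from this thread.

```python
# Open each GRIB2 file with cfgrib.open_datasets in its own worker process, so
# eccodes calls never run concurrently inside a single process.
from concurrent.futures import ProcessPoolExecutor

import cfgrib


def open_one(path):
    # indexpath="" avoids writing .idx sidecar files next to the data.
    datasets = cfgrib.open_datasets(path, backend_kwargs={"indexpath": ""})
    # Load into memory so the results can be pickled back to the parent process.
    return [ds.load() for ds in datasets]


if __name__ == "__main__":
    paths = ["gfs_surface_00.grib2", "gfs_surface_06.grib2"]  # hypothetical paths
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(open_one, paths))
```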

Thanks for the information. I am stumped then; can you think of another reason why I (and others) would see this behavior? m On 08.10.2020 14:47, shahramn wrote: > The...

Awesome. Thanks for looking into it. Not all heroes wear capes :) On 08.10.2020 15:01, shahramn wrote: > Looks like the conda recipe does NOT enable the thread safety flags....

I can also confirm. I just ran a test using:

```python
delays = []
for file in files:
    delays.append(dask.delayed(cfgrib.open_datasets)(file, backend_kwargs={"indexpath": ""}))
client.persist(delays)
```

It previously resulted in killed workers as described. Now the...
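
For context, a sketch of the distributed setup such a test assumes might look like this; the worker counts and file names are illustrative and not taken from the original comment.

```python
# Process-based workers with a single thread each, so eccodes is never called
# concurrently within one process.
import dask
import cfgrib
from dask.distributed import Client

if __name__ == "__main__":
    client = Client(n_workers=4, threads_per_worker=1)
    files = ["gfs_surface_00.grib2", "gfs_surface_06.grib2"]  # hypothetical paths
    delays = [
        dask.delayed(cfgrib.open_datasets)(f, backend_kwargs={"indexpath": ""})
        for f in files
    ]
    persisted = client.persist(delays)   # runs the opens on the workers, keeps results there
    datasets = dask.compute(*persisted)  # pulls the opened datasets back if needed
```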

Hi @alexamici, I would just like to add a data point regarding performance. I am trying to open a 320 MB file on an HPC system where bandwidth is not an...

Hey all, IMO a lazy fit will be a fantastic addition. I had some real pain using StandardScaler with the ColumnTransformer because it called compute for each column. I...
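
A hedged illustration of that pain point: fitting per-column statistics eagerly triggers a separate compute for each column, while a lazy fit can gather all the statistics in one pass. The column names and data here are made up.

```python
import dask
import dask.dataframe as dd
import numpy as np
import pandas as pd

ddf = dd.from_pandas(
    pd.DataFrame({"u": np.random.rand(1000), "v": np.random.rand(1000)}),
    npartitions=4,
)

# Eager, per-column pattern: one graph execution per statistic per column.
stats_eager = {
    col: (ddf[col].mean().compute(), ddf[col].std().compute()) for col in ddf.columns
}

# Lazy-fit pattern: one graph, one pass over the data for every column.
means, stds = dask.compute(ddf.mean(), ddf.std())

# The transform itself can stay lazy; nothing is computed here.
scaled = (ddf - means) / stds
```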

Further to my previous comment, a rough idea of a ColumnTransformer implementation could look like this. It would only work with data in the form of dataframes, but that...
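
Since the code from the original comment is truncated above, here is a minimal sketch of what such a dataframe-only, lazily fitting transformer could look like; the class name and API are hypothetical and the columns are illustrative.

```python
import dask
import dask.dataframe as dd
import numpy as np
import pandas as pd


class LazyColumnScaler:
    """Scale selected columns of a dask DataFrame using a single-pass fit."""

    def __init__(self, columns):
        self.columns = columns

    def fit(self, ddf):
        # Build one graph for all selected columns and evaluate it once.
        self.means_, self.stds_ = dask.compute(
            ddf[self.columns].mean(), ddf[self.columns].std()
        )
        return self

    def transform(self, ddf):
        out = ddf.copy()
        for col in self.columns:
            out[col] = (ddf[col] - self.means_[col]) / self.stds_[col]
        return out  # still lazy; nothing is computed here


pdf = pd.DataFrame(
    {"u": np.random.rand(100), "v": np.random.rand(100), "w": range(100)}
)
ddf = dd.from_pandas(pdf, npartitions=2)
scaled = LazyColumnScaler(["u", "v"]).fit(ddf).transform(ddf)
```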