OSError: [Errno -128] NetCDF: Attempt to use feature that was not turned on when netCDF was built.
Attempting to open a Zarr file in an S3 bucket using the following Python code:
from netCDF4 import Dataset
nc2 = Dataset("s3://mybucket/zarr_key/#mode=zarr,s3", "r")
When I attempt to open this, I get the following traceback:
Traceback (most recent call last):
File "printnetcdf.py", line 8, in <module>
nc2 = Dataset("s3://mybucket/zarr_key/#mode=zarr,s3", "r")
File "src/netCDF4/_netCDF4.pyx", line 2353, in netCDF4._netCDF4.Dataset.__init__
File "src/netCDF4/_netCDF4.pyx", line 1963, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -128] NetCDF: Attempt to use feature that was not turned on when netCDF was built.: b's3://mybucket/zarr_key/#mode=zarr,s3'
This error is odd:
OSError: [Errno -128] NetCDF: Attempt to use feature that was not turned on when netCDF was built.: b's3://mybucket/zarr_key/#mode=zarr,s3'
It indicates that the netcdf-c library was not built with (NC)Zarr support enabled. Can you check the library installation, perhaps by looking at the libnetcdf.settings file?
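One way to check from Python, assuming the nc-config utility that ships with netcdf-c is on your PATH (pip wheels generally do not include it; conda and distro packages usually do):
import subprocess

# Ask the netcdf-c build whether NCZarr was enabled.
print(subprocess.run(["nc-config", "--has-nczarr"],
                     capture_output=True, text=True).stdout.strip())
# --all dumps the full build configuration, much like libnetcdf.settings,
# so you can look for the NCZarr/S3 lines there.
print(subprocess.run(["nc-config", "--all"],
                     capture_output=True, text=True).stdout)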
@cryptoboxcomics How did you install netcdf4-python? Conda? Pip?
netcdf4 was installed using pip3. I was able to run this just fine with a local file store like "file:///directory/to/my/zarr/#mode=nczarr,zarr,file", but it seems like I'm only getting this issue when trying to access S3.
I'm not sure where I can find libnetcdf.settings, but pip3 show netcdf4 displays the following:
Name: netCDF4
Version: 1.6.0
Summary: Provides an object-oriented python interface to the netCDF version 4 library.
Home-page: http://github.com/Unidata/netcdf4-python
Author: Jeff Whitaker
Author-email: [email protected]
License: License :: OSI Approved :: MIT License
Location: /usr/local/lib64/python3.7/site-packages
Requires: cftime, numpy
Required-by:
What OS are you using? Do you know whether your install built from source or used a pre-built wheel (which would be my guess)?
I am not explicitly enabling nczarr-s3 support in the netcdf-c build that is used in the wheels. Should I?
@jswhit That's where I was going to go. It would be nice to have that support baked in, though that will require having the AWS C++ SDK available, IIUC. @DennisHeimbigner ?
Turning on S3 support by default, as you note, requires having the AWS C++ SDK installed. To date, I have only been able to get that to work on Ubuntu Linux. The AWS SDK is way overkill for what we need, so I keep looking for a streamlined alternative, but so far one does not appear to exist.
@DennisHeimbigner Had you seen this? https://github.com/awslabs/aws-c-s3
It's rough, but maybe vendoring it would be the lesser of two evils (the other "evil" being essentially useless S3 support)?
I looked at it some time ago. At that point, it had one of the most complex builds I had ever seen because it was divided into a myriad of separate modules. I never even got it to build on Linux. But it may be worth revisiting to see if it is now more buildable.
@dopplershift, this was installed using a pre-built wheel. The OS that I'm using is CentOS 7.
Sounds like @jswhit's comment about not having enabled nczarr-s3 support when building the wheels is the cause here.
Hi all, I'm doing the same thing: I'm trying to read netCDF and NCZarr files from a local S3 server (an S3 Ninja server, for instance). I compiled the AWS S3 SDK, netcdf-c, and netcdf4-python with nczarr-s3 support enabled.
But this code falls into an infinite loop:
import netCDF4

s3_url_dataset_nc = "s3://localhost:9444/data/my_netcdf_file.nczarr"
dataset = netCDF4.Dataset(s3_url_dataset_nc + "#mode=s3,nczarr")
Any idea what I'm doing wrong? Thanks!
Out of curiosity @CedricPenard, what platform are you on, and which version of netCDF-C? We're working on getting v4.9.3 out, which improves s3 support. There are some tricky issues (not restricted to netCDF) when working with the Amazon S3 SDK, depending on the platform.
As a workaround, try changing
"s3://mybucket/zarr_key/#mode=zarr,s3"
to
"https://s3.amazon.com/mybucket/zarr_key/#mode=zarr,s3"
I am on Ubuntu 22.04. netCDF-C version: 4.9.3-development; netcdf4-python version: 1.7.0.
It turns out it's not an infinite loop: a long time later I get this error:
OSError: [Errno -138] NetCDF: S3 error: 's3://localhost:9444/data/SWOT_L2_LR_PreCalSSH_Expert_002_086_20230814T031152_20230814T040129_PIA1_01.nczarr#mode=nczarr,s3'
Same thing with https instead of s3:
OSError: [Errno -138] NetCDF: S3 error: 'https://localhost:9444/data/SWOT_L2_LR_PreCalSSH_Expert_002_086_20230814T031152_20230814T040129_PIA1_01.nczarr#mode=nczarr,s3'
Is a non-Amazon server supported? I'm working with a local S3 Ninja server.
We would like to support local servers, but have no way to test it. Try this experiment. Execute this command and post the output.
ncdump -h '[log][show=fetch]https://localhost:9444/data/SWOT_L2_LR_PreCalSSH_Expert_002_086_20230814T031152_20230814T040129_PIA1_01.nczarr#mode=nczarr,s3'
Also, the issue may be that the aws-sdk-cpp library does not support non-Amazon servers. Starting with netcdf-c 4.9.3, we have an alternate library that may work (or can be made to work) with non-Amazon servers.
Yes, it seems a non-Amazon S3 server is not supported by the AWS SDK:
ncdump -h '[log][show=fetch]https://localhost:9444/data/SWOT_L2_LR_PreCalSSH_Expert_002_086_20230814T031152_20230814T040129_PIA1_01.nczarr#mode=nczarr,s3'
ERR: curlCode: 28, Timeout was reached key=
ncdump: [log][show=fetch]https://localhost:9444/data/SWOT_L2_LR_PreCalSSH_Expert_002_086_20230814T031152_20230814T040129_PIA1_01.nczarr#mode=nczarr,s3: NetCDF: S3 error
Interesting, and good to note. If you were to check out the main branch from GitHub and compile with the --enable-s3-internal flag, you would be able to test the same URL to see if it works with the integrated S3 SDK alternative.
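After rebuilding and reinstalling, a quick sanity check from Python confirms which netcdf-c the module is actually linked against (illustrative only, not specific to the S3 feature):
import netCDF4

print(netCDF4.__version__)            # netcdf4-python version
print(netCDF4.__netcdf4libversion__)  # should report the rebuilt netcdf-c, e.g. 4.9.3-development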
Compilation with -DENABLE_S3_INTERNAL is OK (I use CMake). I don't get the same error, and it fails immediately:
ncdump -h '[log][show=fetch]https://localhost:9444/nczarrdata/SWOT_L2_LR_PreCalSSH_Expert_002_086_20230814T031152_20230814T040129_PIA1_01.nczarr#mode=nczarr,s3'
ncdump: [log][show=fetch]https://localhost:9444/nczarrdata/SWOT_L2_LR_PreCalSSH_Expert_002_086_20230814T031152_20230814T040129_PIA1_01.nczarr#mode=nczarr,s3: NetCDF: Authorization failure
It's strange; the bucket is public.
@CedricPenard Is localhost:9444 actually HTTPS-protected? The S3 Ninja docs would seem to indicate that it is not, by default.
No, I left the default settings.
What is the command line to supply the key and secret?
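For what it's worth, rather than passing them on the command line, netcdf-c is supposed to pick the key and secret up from the standard AWS profile files. A sketch, where the profile name s3ninja is made up, the keys are the standard AWS documentation examples, and the aws.profile fragment key is my reading of the netcdf-c cloud documentation:
# ~/.aws/credentials would contain a profile such as:
#
#   [s3ninja]
#   aws_access_key_id = AKIAIOSFODNN7EXAMPLE
#   aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
#
# which (if I read the docs correctly) can then be selected in the URL fragment:
import netCDF4

url = ("https://localhost:9444/data/my_netcdf_file.nczarr"
       "#mode=nczarr,s3&aws.profile=s3ninja")
dataset = netCDF4.Dataset(url, "r")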
Hi, I made some tests with another S3 server.
It seems the server URL is not being taken into account:
ncdump -h '[log][show=fetch]https://s3.datalake.cnes.fr/campus-rt-netcdfstreaming/SWOT_L2_HR_PIXC_509_011_242R_20230503T014506_20230503T014517_PIA1_01.nczarr#mode=s3'
NOTE: fetch: https://storage.googleapis.com/campus-rt-netcdfstreaming/SWOT_L2_HR_PIXC_509_011_242R_20230503T014506_20230503T014517_PIA1_01.nczarr.dds
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: <?xml^ version='1.0' encoding='UTF-8'?><Error><Code>NoSuchBucket</Code><Message>The specified bucket does not exist.</Message></Error>
NOTE: fetch complete: 0.537 secs
ncdump: [log][show=fetch]https://s3.datalake.cnes.fr/campus-rt-netcdfstreaming/SWOT_L2_HR_PIXC_509_011_242R_20230503T014506_20230503T014517_PIA1_01.nczarr#mode=s3: NetCDF: file not found
Found the problem: in the NCS3SVC enum, NCS3GS was defined as 0, the same value as NCS3UNK, so my unknown server was being classified as Google Storage (which explains the storage.googleapis.com fetch above):
git diff include/ncs3sdk.h
diff --git a/include/ncs3sdk.h b/include/ncs3sdk.h
index 771faa66..b1dbf506 100644
--- a/include/ncs3sdk.h
+++ b/include/ncs3sdk.h
@@ -9,7 +9,7 @@
/* Track the server type, if known */
typedef enum NCS3SVC {NCS3UNK=0, /* unknown */
NCS3=1, /* s3.amazon.aws */
- NCS3GS=0 /* storage.googleapis.com */
+ NCS3GS=2 /* storage.googleapis.com */
} NCS3SVC;
Now I get this error:
ncdump -h '[log][show=fetch]https://s3.datalake.cnes.fr/campus-rt-netcdfstreaming/SWOT_L2_HR_PIXC_509_011_242R_20230503T014506_20230503T014517_PIA1_01.nczarr#mode=s3'
>>> NC_s3urlrebuild: final=[log][show=fetch]https://s3.datalake.cnes.fr/campus-rt-netcdfstreaming/SWOT_L2_HR_PIXC_509_011_242R_20230503T014506_20230503T014517_PIA1_01.nczarr#mode=s3 bucket=campus-rt-netcdfstreaming region=us-east-1
NOTE: fetch: https://s3.datalake.cnes.fr/campus-rt-netcdfstreaming/SWOT_L2_HR_PIXC_509_011_242R_20230503T014506_20230503T014517_PIA1_01.nczarr.dds
ERR: curl error: SSL peer certificate or SSH remote key was not OK
curl error details:
WARN: oc_open: Could not read url
NOTE: fetch complete: 0.025 secs
ncdump: [log][show=fetch]https://s3.datalake.cnes.fr/campus-rt-netcdfstreaming/SWOT_L2_HR_PIXC_509_011_242R_20230503T014506_20230503T014517_PIA1_01.nczarr#mode=s3: NetCDF: I/O failure
Part of the problem is that the URL you're using is being treated as if it were a DAP2 URL. Try changing the "#mode=s3" at the end to "#mode=zarr,s3" and see if it gets any further along.
With #mode=zarr,s3 I get an "S3 error" without any other information.
Edit: OK, it's a problem with the curl request and authentication. It seems that curl doesn't take ~/.aws/credentials into account. I will try to see why.
When I put "https://s3.datalake.cnes.fr/" into my browser, it says the site does not exist.
Yes, it's only accessible locally.
Hello,
NCH5_s3comms_load_aws_profile is not called. How and where are credentials managed?
See the following functions:
1. libdispatch/ds3util.c#NC_s3sdkinitialize()
2. libdispatch/ds3util.c#NC_aws_load_credentials()
3. libdispatch/ds3util.c#NC_getactives3profile()
4. libdispatch/ds3util.c#NC_authgets3profile()
If memory serves, (1) is called at initialization to load various environment variables, some of which affect the loading of profile information. Function (2) is called at initialization to load profiles from .aws/config and .aws/credentials, as controlled by the info loaded by (1). Function (3) is called at various points to get the current "active" profile, as determined by the URL, various environment variables, or the info read in (1) and (2); it uses (4) to search the list of loaded profiles.
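To make that flow concrete, here is a rough Python paraphrase of the steps described above. The real logic is C code in libdispatch/ds3util.c; the names and the exact lookup order here are illustrative, not the actual implementation:
import configparser
import os

def load_aws_profiles(path=os.path.expanduser("~/.aws/credentials")):
    # Step (2): parse the credentials file into a dict of named profiles.
    cp = configparser.ConfigParser()
    cp.read(path)
    return {name: dict(cp[name]) for name in cp.sections()}

def get_active_profile_name(url_fragment_profile=None):
    # Step (3): the active profile comes from the URL, the environment,
    # or falls back to "default".
    return url_fragment_profile or os.environ.get("AWS_PROFILE") or "default"

def auth_get_profile(profiles, name):
    # Step (4): search the loaded profiles for the active one.
    return profiles.get(name)

profiles = load_aws_profiles()
creds = auth_get_profile(profiles, get_active_profile_name())
print(creds)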