netcdf-c
netcdf-c copied to clipboard
OPeNDAP+HTTPS: certificates not found
With the new NetCDF 4.8.1 updated in Julia, we have an issue to accessing OPeNDAP resources over HTTPs. It seems that no certificate authority is loaded by default. In the new version of NetCDF 4.8.1 we also link against a new version of libCURL (now 7.81.0 previously 7.73.0).
-
the version of the software with which you are encountering an issue NetCDF 4.8.1
-
environmental information (i.e. Operating System, compiler info, java version, python version, etc.)
Ubuntu 20.04 and x86_64-linux-gnu-gcc (GCC) 4.8.5 (compilation is done within a docker container)
- a description of the issue with the steps needed to reproduce it
nc_open (called by NCDataset) with a HTTPS url results in the following error:
julia> using NCDatasets
julia> NCDataset("https://erddap.ifremer.fr/erddap/griddap/SDC_GLO_CLIM_TS_V2_1")
Error:curl error: SSL peer certificate or SSH remote key was not OK
curl error details:
Warning:oc_open: Could not read url
ERROR: NetCDF error: Opening path https://erddap.ifremer.fr/erddap/griddap/SDC_GLO_CLIM_TS_V2_1: NetCDF: I/O failure (NetCDF error code: -68)
Stacktrace:
[1] nc_open(path::String, mode::UInt16)
[...]
If I define a ~/.ncrc file with the following content HTTP.SSL.CAINFO, then OPENDAP over HTTPS works:
HTTP.SSL.CAINFO=/etc/ssl/certs/ca-certificates.crt
I am wondering if somebody can shed some light how it worked before (as in NetCDF 4.7.4/libcurl 7.73.0 such configuration file was not needed) and if it is possible to restore the previous behavior.
As noted by @visr, some libraries (like GDAL, Proj4) expose a function to set this path programmatically. This solution would also be ideal for NetCDF in Julia.
(Any insight on how this is solved in e.g. anaconda or R would be very useful.)
Reference: https://github.com/Alexander-Barth/NCDatasets.jl/issues/173 https://github.com/JuliaPackaging/Yggdrasil/issues/4843
I can't see anything in the libcurl change log: https://curl.se/changes.html Perhaps this is an OPENSSL change?
I'm not sure, but I suspect this change might be on the julia side since other libraries that use curl also need to set this path on Linux.
As noted by @visr, some libraries (like GDAL, Proj4) expose a function to set this path programmatically.
I should've looked better but it seems this already exists; ocset_curlopt. And if I'm not mistaken ocopen needs to be called first to create the OCstate.
I tried using these two functions, but got stuck on an assertion error from netcdf:
drc.c:80: ncrc_setrchome: Assertion `ncrc_globalstate && ncrc_globalstate->home' failed.
More details are at https://github.com/Alexander-Barth/NCDatasets.jl/pull/174
You should not be using any oc function; they were made internal long ago and can only be used in the context of libncdap. Why is your use of .ncrc not acceptable?
For me (as a user of netcdf in my research work), it is certainly acceptable to manually define this configuration file .ncrc.
But as a developer of a julia wrapper, I am a bit worried about all the users hitting this error and not knowing what to do. Some of them will use NetCDF as an indirect dependencies and are not aware that a library down the software stack requires some configuration. The error message seems to indicate that there is a problem with the server certificate.
We considered to use the environment variable DAPRCFILE (pointing to a temporary configuration file), but that could be problematic too.
Good point. As a developer, I sometimes forget to take the naive user into account. In include/ncrc.h there is a technically internal operation to insert a new key/value pair into the internal copy of .ncrc,
EXTERNL int NC_rcfile_insert(const char* key, const char* value, const char* hostport, const char* path);
You might try it as an experiment to see if it works for your problem. You can set the last two parameters to NULL. You will also need to declare the function signature in your code. I will try to think of the proper way to expose equivalent functionality.
Thanks for the tip! I tried NC_rcfile_insert, but have not yet been successful. The certificates are not found, but that seems not too surprising, given that when I try to check the value with NC_rclookup, I get a NULL string back.
The code is here: https://github.com/Alexander-Barth/NCDatasets.jl/blob/9817458ebd95cd90606181846b1fd76b2ab76f18/src/NCDatasets.jl#L40-L52
Which prints
NC_rcfile_insert returns 0
lookup = Cstring(0x0000000000000000)
Am I forgetting something, or is it supposed to work like this?
Thanks @visr, for your test with NC_rcfile_insert and NC_rclookup.
I had look with the gdb debugger while running NC_rcfile_insert from a julia session.
When reaching this line https://github.com/Unidata/netcdf-c/blob/v4.8.1/libdispatch/drc.c#L532, just before returning, globalstate->rcinfo.triples was surprisingly still a null pointer.
549 return ret;
(gdb) p ret
$1 = 0
(gdb) p globalstate->rcinfo.triples
$2 = (NClist *) 0x0
(gdb) p rc
$3 = (NClist *) 0x11e2550
However, when I force globalstate->rcinfo.triples=rc before returning,
(gdb) p globalstate->rcinfo.triples=rc
$4 = (NClist *) 0x11e2550
subsequent lookup do find the HTTP.SSL.CAINFO key:
julia> lookup = unsafe_string(@ccall(libnetcdf.NC_rclookup(key::Cstring, hostport::Cstring, path::Cstring)::Cstring))
"/etc/ssl/certs/ca-certificates.crt"
(And also OPENDAP over HTTPS works without defining a configuration file)
Could it be that NC_rcfile_insert should update globalstate->rcinfo.triples when a new list was created with nclistnew?
Maybe something similar to this? https://github.com/Unidata/netcdf-c/blob/v4.8.1/libdispatch/drc.c#L338
With this PR https://github.com/Unidata/netcdf-c/pull/2352/commits/defa1533705650ba5910caa1fd890a868f37d795 we can now use the private API NC_rcfile_insert to set HTTP.SSL.CAINFO. I am wondering if there is any chance to have a public API for this option?
Thanks for the reminder. I plan to expand that API to a little better access to the rc file: e.. insert, find, remove
Unfortunately, I see this failure now when trying this on NetCDF 4.9.0.
julia: drc.c:117: ncrc_setrchome: Assertion `ncg && ncg->home' failed.
signal (6): Aborted
in expression starting at /home/runner/work/_temp/4a1f6304-e640-43[7](https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true#step:9:8)6-[8](https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true#step:9:9)503-7bb4267810ff:1
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fa[9](https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true#step:9:10)d6d43728)
__assert_fail at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
ncrc_setrchome at /workspace/srcdir/netcdf-c-4.9.0/libdispatch/drc.c:117
NC_rcload at /workspace/srcdir/netcdf-c-4.9.0/libdispatch/drc.c:187
ncrc_initialize at /workspace/srcdir/netcdf-c-4.9.0/libdispatch/drc.c:[10](https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true#step:9:11)1
NC_rcfile_insert at /workspace/srcdir/netcdf-c-4.9.0/libdispatch/drc.c:573
init_certificate_authority at /home/runner/work/NCDatasets.jl/NCDatasets.jl/src/netcdf_c.jl:2[13](https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true#step:9:14)4
It seems that ncg->home is still NULL after calling NC_getglobalstate():
#4 0x00007fff97dacd35 in ncrc_setrchome () at drc.c:117
117 assert(ncg && ncg->home);
(gdb) p ncg
$1 = (NCglobalstate *) 0x14cda50
(gdb) p ncg->home
$2 = 0x0
Maybe I need to call now NCDISPATCH_initialize to ensure that the global state is fully initialized?
This is windows/mingw, correct?
This issue is now on Linux (Ubuntu 20.04), also reproduced on CI:
https://github.com/Alexander-Barth/NCDatasets.jl/runs/6857358120?check_suite_focus=true
It might be also present on other platforms, but I did not reached the same point on Windows or Mac OS (all 3 fail, but they fail earlier).
Can you run the following command in the docker to see if the file is accessible at all?
wget "https://erddap.ifremer.fr/erddap/griddap/SDC_GLO_CLIM_TS_V2_1.dds"
or you can use curl if you prefer In any case, post the result.
This is the output inside the docker container:
sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.0 # wget "https://erddap.ifremer.fr/erddap/griddap/SDC_GLO_CLIM_TS_V2_1.dds"
--2022-06-13 19:13:59-- https://erddap.ifremer.fr/erddap/griddap/SDC_GLO_CLIM_TS_V2_1.dds
Resolving erddap.ifremer.fr (erddap.ifremer.fr)... 134.246.142.39
Connecting to erddap.ifremer.fr (erddap.ifremer.fr)|134.246.142.39|:443... connected.
HTTP request sent, awaiting response... 200 200
Length: 2417 (2.4K) [text/plain]
Saving to: 'SDC_GLO_CLIM_TS_V2_1.dds'
SDC_GLO_CLIM_TS_V2_1.dds 100%[===================================================================================>] 2.36K --.-KB/s in 0s
2022-06-13 19:13:59 (149 MB/s) - 'SDC_GLO_CLIM_TS_V2_1.dds' saved [2417/2417]
sandbox:${WORKSPACE}/srcdir/netcdf-c-4.9.0 # head SDC_GLO_CLIM_TS_V2_1.dds
Dataset {
Float64 time[time = 12];
Float64 depth[depth = 45];
Float64 latitude[latitude = 641];
Float64 longitude[longitude = 1440];
GRID {
ARRAY:
Float32 Temperature[time = 12][depth = 45][latitude = 641][longitude = 1440];
MAPS:
Float64 time[time = 12];
We use docker to compile the NetCDF and use the resulting binary libnetcdf.so.XY without docker. I see this error actually outside of the docker environment (same results of the wget test).
Thanks. The fact that it fails on linux is odd. I can see why it might fail on windows. Did you try with an older version of libcurl?
Julia also depends on libcurl, so the libcurl version is kind of fixed. Calling the private NC_rcfile_insert to set HTTP.SSL.CAINFO as you suggested here is indeed a good approach (even better of course would be a public API :-)) . This worked well in NetCDF 4.8.1 but in NetCDF 4.9.0 there is an additional assert which now fails:
$ git checkout v4.8.1
$ grep -r 'ncg->home' . # no output
# grep assert ./libdispatch/drc.c # no output
$ git checkout v4.9.0
$ grep -r 'ncg->home' .
./libdispatch/drc.c: assert(ncg && ncg->home); # <- our failing assert
./libdispatch/drc.c: tmp = ncg->home;
If I call NCDISPATCH_initialize from Julia before calling NC_rcfile_insert, ncg->home gets correctly initialized and NC_rcfile_insert now also work in with NetCDF 4.9.0, but I would like to minimize the number of private functions that we have to call.
Here is the proposed API for managing the .rc table:
Add two functions to netcdf.h to allow programs to walk and modify the internal .rc tables. This should fix the above issue by allowing HTTP.CAINFO to be set to the certificates directory. Note that the changes should be performed as early as possible in the program because some of the .rc table entries may get cached internally and changing the entry after that caching occurs may have no effect.
The new signatures are as follows:
- Iterate over the entries in the .rc table
int nc_rc_ith(int index, char* const * keyp, char* const* valuep, char* const* hostportp, char* const* urlpathp);
@param index which table entry to return
@param keyp table entry key field -- may be NULL
@param valuep table entry value field -- may be NULL
@param hostportp table entry hostport field (of form host | host:port) -- may be NULL
@param urlpathp table entry urlpath field -- may be NULL
@return NC_NOERR if no error
@return NC_EINVAL if index out of range
- Append/Overwrite/Delete Entry in .rc table
int nc_rc_replace(const char* key, const char* hostport, const char* urlpath, const char* value);
@param key table entry key field -- may not be NULL
@param value table entry value field -- NULL => delete
@param hostport table entry hostport field (of form host | host:port) -- may be NULL
@param urlpath table entry urlpath field -- may be NULL
@return NC_NOERR if no error
This API seems perfect to me! (but I cannot tell if there are enough const and * in the function signature as my C knowledge is quite limited ;-) )
I am wondering if the strings passed to nc_rc_replace can be freed safely after this function returned or are they expected to stay in memory (which is a bit more difficult to with the garbage collector in julia (and other languages)) ?
Yeh, const are confusing. I think I will change it so that the return values are copies of the internal strings so the user must free them. I will change the documenation to note that.
For the replace fcn, the strings are copied internally, so the user can reclaim the arguments if desired. I will change the documenation to not that.
Yes, handling the memory as you propose makes the most sense to me too!
I changed the API to significantly simplify it and to cover the most common cases.
Unfortunately, I also see this failed assertion when I call nc_rc_set from Julia (first call to the NetCDF library after loading the library) using the PR https://github.com/Unidata/netcdf-c/pull/2408.
https://github.com/Unidata/netcdf-c/issues/2337#issuecomment-1153640947
$ LANG=C julia-1.8 --eval 'ENV["JULIA_DEBUG"] = "NCDatasets"; using NCDatasets; NCDataset("https://opendap.puertos.es/thredds/dodsC/tidegauge_alge/2022/06/MIR2Z_Algeciras_Alge_3541_202206_analysis_event.nc4")'
┌ Debug: nc_rc_set: set HTTP.SSL.CAINFO to /etc/ssl/certs/ca-certificates.crt
└ @ NCDatasets ~/.julia/dev/NCDatasets/src/netcdf_c.jl:2174
julia-1.8: drc.c:171: ncrc_setrchome: Assertion `ncg && ncg->home' failed.
signal (6): Aborted
in expression starting at none:1
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7fa61c5d7489)
(OS: Linux and $HOME defined)
I am closing this issue because it is likely to be the same as: https://github.com/Unidata/netcdf-c/issues/2486
The option -std=c99 (used here) results that the function strdup is undefined and any operation on strings like using the home path leads in a crash.
Thank you very much for implementing the new API and your work in NetCDF in general !