DAP server error when trying to subset a variable
We are seeing an inconsistent NC_EDAPSVC error when trying to read a subset from variables in the http://opendap.nccs.nasa.gov:80/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np dataset.
For example, the error happens when subsetting a variable right away, but not if you first read the entire variable. In this dataset, it seems that all variables except lev, lat, and lon have this issue.
Here is standalone reproduction code that uses the "time" variable as an example:
#include <iostream>
#include <vector>
#include "netcdf.h"

void checkErrorCode(int status, const char* funcName){
    if (status != NC_NOERR){
        std::cout << "Error code: " << status << " from " << funcName << std::endl;
        std::cout << nc_strerror(status) << std::endl << std::endl;
    }
}

int main(int argc, const char * argv[]) {
    // open file
    int ncid = 0;
    int retval = nc_open("http://opendap.nccs.nasa.gov:80/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np", NC_NOWRITE, &ncid);
    checkErrorCode(retval, "nc_open");

    // get ID for the variable
    int varid = 0;
    retval = nc_inq_varid(ncid, "time", &varid);
    checkErrorCode(retval, "nc_inq_varid");
    std::cout << "Variable ID is " << varid << std::endl;

    // // read the whole variable - works fine
    // // (it is a 1D variable with total count of 21214)
    // std::cout << std::endl << "Reading the whole 'time' variable" << std::endl;
    // const int totalCount = 21214;
    // std::vector<double> wholeVar(totalCount, 0.0);
    //
    // retval = nc_get_var_double(ncid, varid, wholeVar.data());
    // checkErrorCode(retval, "nc_get_var_double");

    // read a subset of the variable - errors out
    // (it is a 1D variable with total count of 21214)
    std::cout << std::endl << "Reading a subset of the 'time' variable" << std::endl;
    const int subsetCount = 10;
    std::vector<double> subsetVar(subsetCount, 0.0);
    size_t start[] = {0};
    size_t count[] = {subsetCount};
    ptrdiff_t stride[] = {1};
    retval = nc_get_vars_double(ncid, varid, start, count, stride, subsetVar.data());
    checkErrorCode(retval, "nc_get_vars_double");

    // read the whole variable - works fine
    // (it is a 1D variable with total count of 21214)
    std::cout << std::endl << "Reading the whole 'time' variable" << std::endl;
    const int totalCount = 21214;
    std::vector<double> wholeVar(totalCount, 0.0);
    retval = nc_get_var_double(ncid, varid, wholeVar.data());
    checkErrorCode(retval, "nc_get_var_double");

    // close file
    retval = nc_close(ncid);
    checkErrorCode(retval, "nc_close");
    return retval;
}
Here is the output this code produces (the subsetting errors out, but reading the whole variable works):
% ./readSubsetFromDAP.o
Variable ID is 0
Reading a subset of the 'time' variable
oc_open: server error retrieving url: code=0 message="constraint parsing failed; The variable `time%5b0' was not found in the dataset."Error code: -70 from nc_get_vars_double
NetCDF: DAP server error
Reading the whole 'time' variable
But if I switch the order and read the whole variable first and then do the subsetting, the error does not happen at all. I've also seen that sometimes subsetting in a way that still reads the whole variable (start=0, stride=1, count= dimension size) does not error, but sometimes it does.
I see some potentially related existing discussion about URL encoding and special characters like "[" ("%5b" mentioned in the error above): https://github.com/Unidata/netcdf-c/issues/1425#issuecomment-508207239 .
Would appreciate any suggestions on what might be the issue here and any possible workarounds!
We are using netCDF 4.9.2.
@DennisHeimbigner is there anything here that might be server side that occurs to you immediately?
It is a bit hard to trace this down definitively (we upgraded curl and OpenSSL around the same time), but we may have started seeing this error after netCDF v4.7.3.
Also, if it helps, here is the relevant part of the output log when using setenv CURLOPT_VERBOSE 1:
Reading a subset of the 'time' variable
* Found bundle for host: 0x557d9e7f2040 [serially]
* Can not multiplex, even if we wanted to
* Re-using existing connection #0 with host opendap.nccs.nasa.gov
> GET /dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?lev,lat,lon HTTP/1.1
Host: opendap.nccs.nasa.gov
User-Agent: oc4.9.2
Accept: */*
Accept-Encoding: deflate, gzip, br, zstd
< HTTP/1.1 301 Moved Permanently
< Date: Tue, 11 Mar 2025 12:46:41 GMT
< Server: Apache
< Strict-Transport-Security: max-age=31536000
< Location: https://opendap.nccs.nasa.gov/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?lev,lat,lon
< Content-Length: 300
< Content-Type: text/html; charset=iso-8859-1
<
* Ignoring the response-body
* Connection #0 to host opendap.nccs.nasa.gov left intact
* Clear auth, redirects to port from 80 to 443
* Issue another request to this URL: 'https://opendap.nccs.nasa.gov/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?lev,lat,lon'
* Found bundle for host: 0x557d9e7f2940 [serially]
* Can not multiplex, even if we wanted to
* Re-using existing connection #1 with host opendap.nccs.nasa.gov
> GET /dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?lev,lat,lon HTTP/1.1
Host: opendap.nccs.nasa.gov
User-Agent: oc4.9.2
Accept: */*
Accept-Encoding: deflate, gzip, br, zstd
< HTTP/1.1 200
< Date: Tue, 11 Mar 2025 12:46:40 GMT
< Server:
< Strict-Transport-Security: max-age=31536000
< XDODS-Server: dods/3.2
< XDAP: 3.2
< Content-Description: dods_data
< Last-Modified: Tue, 11 Mar 2025 11:56:17 GMT
< Content-Type: application/octet-stream
< Strict-Transport-Security: max-age=63072000; includeSubDomains;
< Transfer-Encoding: chunked
<
* Connection #1 to host opendap.nccs.nasa.gov left intact
* Found bundle for host: 0x557d9e7f2040 [serially]
* Can not multiplex, even if we wanted to
* Re-using existing connection #0 with host opendap.nccs.nasa.gov
> GET /dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?time%5b0:9%5d HTTP/1.1
Host: opendap.nccs.nasa.gov
User-Agent: oc4.9.2
Accept: */*
Accept-Encoding: deflate, gzip, br, zstd
< HTTP/1.1 301 Moved Permanently
< Date: Tue, 11 Mar 2025 12:46:41 GMT
< Server: Apache
< Strict-Transport-Security: max-age=31536000
< Location: https://opendap.nccs.nasa.gov/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?time%255b0:9%255d
< Content-Length: 306
< Content-Type: text/html; charset=iso-8859-1
<
* Ignoring the response-body
* Connection #0 to host opendap.nccs.nasa.gov left intact
* Clear auth, redirects to port from 80 to 443
* Issue another request to this URL: 'https://opendap.nccs.nasa.gov/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?time%255b0:9%255d'
* Found bundle for host: 0x557d9e7f2940 [serially]
* Can not multiplex, even if we wanted to
* Re-using existing connection #1 with host opendap.nccs.nasa.gov
> GET /dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?time%255b0:9%255d HTTP/1.1
Host: opendap.nccs.nasa.gov
User-Agent: oc4.9.2
Accept: */*
Accept-Encoding: deflate, gzip, br, zstd
< HTTP/1.1 200
< Date: Tue, 11 Mar 2025 12:46:41 GMT
< Server:
< Strict-Transport-Security: max-age=31536000
< XDODS-Server: dods/3.2
< XDAP: 3.2
< Content-Description: dods_error
< Last-Modified: Tue, 11 Mar 2025 11:56:17 GMT
< Cache-Control: no-cache
< Content-Type: text/plain
< Content-Length: 123
< Strict-Transport-Security: max-age=63072000; includeSubDomains;
<
* Connection #1 to host opendap.nccs.nasa.gov left intact
oc_open: server error retrieving url: code=0 message="constraint parsing failed; The variable `time%5b0' was not found in the dataset."Error code: -70 from nc_get_vars_double
NetCDF: DAP server error
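For reference, the query string in the log above (`?time%5b0:9%5d`) is the percent-encoded form of the DAP2 constraint `time[0:9]`. Below is a minimal sketch of how a start/count/stride hyperslab maps onto such a constraint, assuming the standard DAP2 `[start:stride:stop]` projection syntax with an inclusive stop index. `dap2_constraint` is a hypothetical helper written for illustration; it is not part of netCDF-C.

```python
def dap2_constraint(name, start, count, stride):
    """Build a DAP2 projection such as 'time[0:9]' from per-dimension start/count/stride."""
    parts = [name]
    for s, c, st in zip(start, count, stride):
        if c < 1 or st < 1:
            raise ValueError("count and stride must be positive")
        stop = s + (c - 1) * st  # the DAP2 stop index is inclusive
        # Omit the stride when it is 1, matching the request seen in the log.
        parts.append(f"[{s}:{stop}]" if st == 1 else f"[{s}:{st}:{stop}]")
    return "".join(parts)


# Same hyperslab as the reproduction code: start=0, count=10, stride=1.
print(dap2_constraint("time", [0], [10], [1]))
```

With the values from the reproduction code this yields `time[0:9]`, which is what appears (percent-encoded) in the first GET request for the subset.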
@DennisHeimbigner is there anything here that might be server side that occurs to you immediately?
Let me take a look.
Hello, this is the most similar case I could find, so I am posting mine here, since I suspect they might have similar causes. My code worked perfectly for a few days, but 3 days ago it suddenly started raising the DAP server error when accessing a subset of this dataset: https://hydro1.gesdisc.eosdis.nasa.gov/dods/NLDAS_FORA0125_H.002
Here is my code that suddenly stopped working:
import xarray as xr

def access_opendap_subset(dataset_url, start_date, end_date, bbox, variables):
    """
    Access a subset of any dataset via OPeNDAP using xarray.

    Parameters:
    - dataset_url (str): The OPeNDAP URL of the dataset.
    - start_date (str): Start date in 'YYYY-MM-DD' format.
    - end_date (str): End date in 'YYYY-MM-DD' format.
    - bbox (tuple): Bounding box as (min_lon, min_lat, max_lon, max_lat).
    - variables (list): List of variable names to extract.

    Returns:
    - xarray.Dataset: The subset dataset.
    """
    # Open the dataset via OPeNDAP
    ds = xr.open_dataset(dataset_url, decode_times=True)
    # Subset by time
    ds = ds.sel(time=slice(start_date, end_date))
    # Subset by spatial bounding box
    min_lon, min_lat, max_lon, max_lat = bbox
    ds = ds.sel(lon=slice(min_lon, max_lon), lat=slice(min_lat, max_lat))
    # Select specified variables
    ds = ds[variables]
    return ds

nldas_url = "https://hydro1.gesdisc.eosdis.nasa.gov/dods/NLDAS_FORA0125_H.002"
start_date = "2016-01-29"
end_date = "2016-01-30"
bounding_box = (-105.375, 38.965, -93.975, 44.025)  # (min_lon, min_lat, max_lon, max_lat)
variables = ['ugrd10m', 'vgrd10m']  # 10-m above ground Zonal and Meridional wind speed
nldas_subset = access_opendap_subset(nldas_url, start_date, end_date, bounding_box, variables)
print(nldas_subset)
Below is a more detailed error I receive:
oc_open: server error retrieving url: code=0 message="not an available dataset"
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File ~/miniconda3/envs/geotools/lib/python3.12/site-packages/xarray/backends/file_manager.py:211, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
    210 try:
--> 211     file = self._cache[self._key]
    212 except KeyError:
File ~/miniconda3/envs/geotools/lib/python3.12/site-packages/xarray/backends/lru_cache.py:56, in LRUCache.__getitem__(self, key)
     55 with self._lock:
---> 56     value = self._cache[key]
     57     self._cache.move_to_end(key)
KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('https://hydro1.gesdisc.eosdis.nasa.gov/dods/NLDAS_FORA0125_H.002',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False)), 'fa99d556-5da7-400c-9cc8-c7fea234aebb']
During handling of the above exception, another exception occurred:
OSError Traceback (most recent call last)
Cell In[4], line 9
      6 variables = ['ugrd10m', 'vgrd10m'] # 10-m above ground Zonal and Meridional wind speed
      8 # Access the dataset using the function
----> 9 nldas_subset = access_opendap_subset(nldas_url, start_date, end_date, bounding_box, variables)
     11 # Print dataset metadata
     12 print(nldas_subset)
Cell In[3], line 18
...
File src/netCDF4/_netCDF4.pyx:2463, in netCDF4._netCDF4.Dataset.__init__()
File src/netCDF4/_netCDF4.pyx:2026, in netCDF4._netCDF4._ensure_nc_success()
OSError: [Errno -70] NetCDF: DAP server error: b'https://hydro1.gesdisc.eosdis.nasa.gov/dods/NLDAS_FORA0125_H.002'
Hi @DennisHeimbigner - sorry to nudge you, but anything you can say so far? Any additional data we can provide? Thank you for taking a look!
Kris
oc_open: server error retrieving url: code=0 message="not an available dataset"
This indicates that the client is getting an HTTP code of 0 from the server. I would have thought this was impossible, so I need to dig deeper..
Ok, the error message is being generated by the server and recognized by the DAP2 code in libnetcdf-c. So, the problem must be investigated on the server side.
Thank you so much, Dennis! Any thoughts on why we might not be seeing this issue with netCDF prior to v4.7.3 (+ the older versions of curl and openSSL)?
Ok, this is a bit complicated.
First, from the ncdump/netcdf-c point of view, this request:
ncdump -v "RetrievalGeometry_retrieval_longitude" "https://oco2.gesdisc.eosdis.nasa.gov/opendap/OCO2_L2_Standard.11.2r/2024/107/oco2_L2StdND_52084a_240416_B11205r_240610060300.h5"
is ambiguous (sort of).
The netcdf-c code looks at the URL for clues as to whether it is DAP2 or DAP4. Since there are no explicit clues, it decides it is DAP2 and makes the appropriate .dds and .das requests to get the metadata, which the server interprets correctly.
The DAP2 code then does a prefetch of data, using certain rules to decide what to prefetch. The netcdf-c DAP2 code creates a prefetch list that is very long, which I think is what you are seeing and which causes the error.
You can, instead, force the URL to be interpreted as DAP4 by suffixing the URL with '#dap4'. In this case, the code is smarter and does not ask for prefetch, but rather just asks for the specified variable by appending "?dap4.ce=/RetrievalGeometry_retrieval_longitude" to the URL.
So, if you are accessing the URL as DAP4, you can append '#dap4' and you should get what you want (modulo any other failures detected).
If, on the other hand, you want DAP2, it is possible to suppress prefetch by either prefixing the URL with "[noprefetch]" or suffixing it with "#noprefetch". This would solve the immediate problem, but it appears to be failing later on with a different failure.
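To make the three URL forms above concrete, here is a small sketch that only builds the strings; it does not contact any server. `dap_url_variants` is a hypothetical helper, and the prefix is assumed to be spelled `[noprefetch]`, matching the `#noprefetch` suffix.

```python
def dap_url_variants(base_url):
    """Return the URL forms discussed above for a given dataset URL."""
    return {
        # Force DAP4 interpretation via a fragment.
        "dap4": base_url + "#dap4",
        # Keep DAP2 but suppress prefetch, as a bracketed prefix ...
        "noprefetch_prefix": "[noprefetch]" + base_url,
        # ... or as a fragment suffix.
        "noprefetch_suffix": base_url + "#noprefetch",
    }


url = ("https://oco2.gesdisc.eosdis.nasa.gov/opendap/OCO2_L2_Standard.11.2r/"
       "2024/107/oco2_L2StdND_52084a_240416_B11205r_240610060300.h5")
for label, variant in dap_url_variants(url).items():
    print(f"{label}: {variant}")
```

Any of these strings would then be passed to nc_open or ncdump in place of the plain URL.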
@DennisHeimbigner The problem was solved by changing the link. It seems the original link in my code has expired: today it did not work, and I could not even open it in a browser. So I searched and replaced it with the link below. I also had to replace the variable names 'ugrd10m' and 'vgrd10m' with 'wind_e' and 'wind_n', respectively. The code started to work smoothly again!
https://hydro1.gesdisc.eosdis.nasa.gov/dods/NLDAS_FORA0125_H.2.0
Thank you for your help and response.
Hi @DennisHeimbigner - thank you for your help!
oc_open: server error retrieving url: code=0 message="not an available dataset"
This indicates that the client is getting an HTTP code of 0 from the server. I would have thought this was impossible, so I need to dig deeper..
Based on the quote in your response here, your investigation was focused on the issue posted by @Babakjfard . I am seeing a slightly different error:
oc_open: server error retrieving url: code=0 message="constraint parsing failed; The variable `time%5b0' was not found in the dataset."Error code: -70 from nc_get_vars_double
NetCDF: DAP server error
Just wanted to double-check - does your analysis and response apply to both issues (and the assumption is that they are similar/same)? I understand that in both of these cases the HTTP code from the server was 0. But it looked like in my case it might have something to do with parsing the square brackets in the url ("[" corresponding to code %5b) when trying to get a subset of the variable. To me, this sounds a bit similar to https://github.com/Unidata/netcdf-c/pull/1439 ?
I tried both dap4://opendap.nccs.nasa.gov:80/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np and http://opendap.nccs.nasa.gov:80/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np#noprefetch. In the first case I got a memory allocation (malloc) failure (NC_ENOMEM), and in the second a similar DAP server error (NC_EDAPSVC).
Sorry if I am missing something!
Let me investigate.
Hi @DennisHeimbigner - did you get a chance to take a look?
Not yet; I got sidetracked on another issue. I have time today.
What happens when you try this command:
wget http://opendap.nccs.nasa.gov:80/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dmr
It looks to me like this dataset:
http://opendap.nccs.nasa.gov:80/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np
is not a dap4 dataset at all, but only a DAP2 dataset.
What happens when you try this command:
wget http://opendap.nccs.nasa.gov:80/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dmr
It downloads an inst3_3d_asm_Np.dmr file with the following HTML content:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>GrADS Data Server - error</title>
</head>
<body bgcolor="#ffffff">
<a href="http://opendap.nccs.nasa.gov:80/dods">GrADS Data Server</a>
<br>
<h2>GrADS Data Server - error</h2>
<hr><br>
The server could not fulfill this request:<p><b>check your url, please</b><p> because of the following error:<p>
<b>wrong URL or bad injection !!!!</b><p>
Check the syntax of your request, or click <a href=".help">here</a> for help using the server.
<br>
<hr><font size="-1"><a href="http://www.iges.org/grads/gds">
GrADS Data Server</a> 2.0 (<a href="http://opendap.nccs.nasa.gov:80/dods/help">help using this server</a>)
. </font>
<br>
</body>
</html>
It looks to me like this dataset:
http://opendap.nccs.nasa.gov:80/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np is not a dap4 dataset at all, but only a DAP2 dataset.
Do you think DAP2 vs DAP4 plays into the issue I am seeing? I am still puzzled that reading the whole variable works - only trying to subset causes the error.
Any thoughts on why "[" seems to be treated as part of the variable name?
oc_open: server error retrieving url: code=0 message="constraint parsing failed; The variable `time%5b0' was not found in the dataset."Error code: -70 from nc_get_vars_double
NetCDF: DAP server error
At this point I do not know what to think. I suspect that GrADS is not handling the requests correctly, but I do not know for sure. I guess I will keep on investigating, as should you.
Looking at the URL from one of the outputs above (with CURLOPT_VERBOSE 1):
https://opendap.nccs.nasa.gov/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?time%255b0:9%255d
the OPeNDAP URL is encoded twice. The URL that is generated internally, which is likely:
https://opendap.nccs.nasa.gov/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?time[0:9]
is encoded a first time and '[' and ']' become '%5b' and '%5d' respectively, which is correct, but then there is a second encoding that converts '%' into '%25', and we get these '%255b' and '%255d' in the final URL that are incorrect.
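The double encoding can be reproduced offline with Python's standard library. This is only an illustration of the effect, not the code path netCDF-C or curl actually takes. Note that `urllib.parse.quote` emits uppercase hex digits (`%5B`), whereas the log shows lowercase (`%5b`); the two are equivalent. In the verbose log earlier in this thread, the client's first request carries `%5b` and the `Location` header of the 301 redirect carries `%255b`.

```python
from urllib.parse import quote, unquote

constraint = "time[0:9]"

# First encoding: '[' and ']' become %5B and %5D (':' is kept as-is here).
once = quote(constraint, safe=":")

# Second encoding: the '%' itself is now escaped as %25.
twice = quote(once, safe=":")

# A server that decodes only once is left with a still-encoded bracket.
server_view = unquote(twice)

print(once)
print(twice)
print(server_view)
```

After a single decode the server still sees an escaped bracket, which is consistent with the error message treating `time%5b0` as a variable name.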
I need to know the complete command, e.g. wget vs. curl vs. ncdump. Also, are you asking for DAP2 or DAP4? It looks like you are using wget/curl and asking for DAP2.
https://opendap.nccs.nasa.gov/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dods?time[0:9]
Hi Dennis,
I've attached the reproduction code in my original post above (we initially saw the issue with nc_get_vars_double ). I have also provided output with CURLOPT_VERBOSE 1 above (I have NOT tried using curl directly - I do not know enough to try). And I have responded with what I get with wget.
Using ncdump was not part of our workflow, but I can take some time to set it up and try it on this URL if you think that would be helpful for your investigation.
I am not sure how to "ask" for DAP2 vs DAP4 beyond modifying the URL.. You've advised @Babakjfard above to try suffixing the URL with '#dap4' and/or suffixing it with "#noprefetch" and I have tried that above as well. But in the original repro steps I am just using the URL as is.
Are you able to reproduce the issue on your side with the provided reproduction steps and information?
If so, do you see the "double-encoding" issue that @cwannaz pointed out when running the netCDF-C reproduction steps with setenv CURLOPT_VERBOSE 1?
Do let me know what else I can provide (and I apologize for my lack of expertise in DAP workflows).
Ok, apparently a feature got lost from the documentation, so there is a workaround. Try adding 'encode=' to the URL as a fragment. So for example:
http://opendap.nccs.nasa.gov:80/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np.dmr?time[0:10]#encode=
The encode flag turns off URL (%xx) encoding for a request and then turns on selected encoding. So for example:
- encode= turns off all encoding
- encode=path causes special characters in the URL path to be encoded
- encode=query ditto for encoding query part of the URL
- encode=path,query to encode both
- encode=all is same as path,query
Actually, it is documented in the NUG (see DAP2.md in the Unidata/netcdf repo).
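As a rough sketch of applying this from client code, the helper below only appends the `encode=` fragment to a URL string before it is handed to nc_open (or to a wrapper such as netCDF4-python or xarray). `with_encode_fragment` is a hypothetical name, and joining with `&` when the URL already has a fragment is an assumption about how multiple fragment parameters are combined.

```python
def with_encode_fragment(url, parts=""):
    """Append '#encode=<parts>' to a DAP URL.

    parts: "" turns off all encoding; "path", "query", "path,query" or "all"
    select which pieces are encoded, per the list above.
    """
    # Assumption: an existing fragment is extended with '&'.
    sep = "&" if "#" in url else "#"
    return f"{url}{sep}encode={parts}"


base = "http://opendap.nccs.nasa.gov:80/dods/GEOS-5/fp/0.25_deg/assim/inst3_3d_asm_Np"
print(with_encode_fragment(base))          # turn off all encoding
print(with_encode_fragment(base, "path"))  # encode only the path
```

For the reproduction code in the original post, this amounts to passing the first printed URL to nc_open instead of the plain one.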
Thank you, @DennisHeimbigner - using #encode= to turn off encoding does seem to resolve the issue!