
Runtime Error: NetCDF: File not found

Open tnoelcke opened this issue 7 years ago • 17 comments

I'm working on reading some data from a THREDDS server using OPeNDAP in netCDF4 format. The URL for the server I'm working with is here: https://climate.northwestknowledge.net/RangelandForecast/download.php When accessing some of the lat/lon values in this dataset I get this in the terminal:

Traceback (most recent call last):
    File "getData.py", line 91, in <module>
        (latI, lonI) = getIndex(latTarget, lonTarget, lathandle, lonhandle, datahanlde)
    File "getData.py", line 62, in getIndex
        check = datahanlde[lat_index, lon_index, 0]
    File "netCDF4/_netCDF4.pyx", line 3961, in netCDF4._netCDF4.Variable.__getitem__
    File "netCDF4/_netCDF4.pyx", line 3961, in netCDF4._netCDF4.Variable._get
    File "netCDF4/_netCDF4.pyx", line 3961, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: file not found

I'm running Bash on Ubuntu 14.04 as a Linux subsystem on a Windows machine. I'm using conda v4.4.8 running Python 2.7.14, and I have HDF5 installed along with netCDF4 version 1.3.1.

I can post the code if you feel you need it.

Thanks!

tnoelcke avatar Jan 29 '18 04:01 tnoelcke

That error usually means the file was not accessible for some reason (either the server was down, or you don't have permission to access it).

What is the actual URL you used? (The one you gave is not a valid OPeNDAP URL.)

jswhit avatar Jan 29 '18 12:01 jswhit

Here is the URL I am using in my code: http://tds-proxy.nkn.uidaho.edu/thredds/dodsC/NWCSC_INTEGRATED_SCENARIOS_ALL_CLIMATE/bcsd-nmme/monthlyForecasts/bcsd_nmme_metdata_ENSMEAN_forecast_1monthAverage.nc
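
For context, the lookup in getData.py is essentially this (a minimal sketch; the coordinate variable names, the placeholder data variable 'somevar', and the target point are assumed for illustration, not taken from the actual script):

import numpy as np
from netCDF4 import Dataset

url = ('http://tds-proxy.nkn.uidaho.edu/thredds/dodsC/'
       'NWCSC_INTEGRATED_SCENARIOS_ALL_CLIMATE/bcsd-nmme/'
       'monthlyForecasts/bcsd_nmme_metdata_ENSMEAN_forecast_1monthAverage.nc')

datahandle = Dataset(url)             # open the remote dataset over OPeNDAP
lat = datahandle.variables['lat'][:]  # coordinate arrays are small, read fully
lon = datahandle.variables['lon'][:]

latTarget, lonTarget = 46.7, -117.0   # example point inside the covered domain
lat_index = np.abs(lat - latTarget).argmin()  # nearest-neighbour index lookup
lon_index = np.abs(lon - lonTarget).argmin()

# the remote read happens here; this is the line that raises the RuntimeError
check = datahandle.variables['somevar'][lat_index, lon_index, 0]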

tnoelcke avatar Jan 29 '18 14:01 tnoelcke

The thing that I don't understand is that it works for some lat/lon pairs but not others that are still inside the range that I know is stored on that server.

tnoelcke avatar Jan 29 '18 18:01 tnoelcke

Perhaps the server is flaking out at just the moment you are requesting those lat/lon pairs? The error is coming from the C library, not the Python interface, so whatever is going on is probably not an issue on the Python side.

jswhit avatar Jan 29 '18 19:01 jswhit

I think you're right; it must be an issue with the server I'm trying to connect to. Thanks for the help!

tnoelcke avatar Jan 29 '18 22:01 tnoelcke

After spending some time talking to the system administrator about the problem, we discovered that the read error on the netCDF4 file was caused by file chunking. I'm not sure whether this is due to the Python interface or the C library, but we don't get the same read errors in MATLAB. Additionally, according to the system admin, I'm not the only person who has had this issue on the same system. Is there anything special I need to do when working with chunked files?

Any help or pointers would be much appreciated.

tnoelcke avatar Feb 03 '18 21:02 tnoelcke

There's nothing you need to do to read chunked files - it's all handled in the HDF5 library. You can specify the chunksizes when writing, or let the library choose default values. There's not much we can do without a self-contained, reproducible example program that triggers the error you are seeing.
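
If you do want control when writing, chunk sizes can be set per variable via createVariable (a minimal sketch; the dimension sizes and chunk shape here are arbitrary):

from netCDF4 import Dataset

nc = Dataset('chunked_example.nc', 'w')
nc.createDimension('time', None)   # unlimited
nc.createDimension('lat', 361)
nc.createDimension('lon', 720)

# chunksizes must have one entry per dimension; omit the keyword to let
# the library pick default chunk sizes
rh = nc.createVariable('rh', 'f4', ('time', 'lat', 'lon'),
                       chunksizes=(1, 361, 720))
nc.close()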

jswhit avatar Feb 03 '18 23:02 jswhit

I'm getting exactly the same error calling GFS data from NCEP. You can use this as the self-contained example requested above. My feeling is that a timeout is happening, or the module cannot maintain the connection. A GrADS script calling the same DODS server for the same data request has no problems connecting and staying connected. I've also tried this on several different and widely separated machines, so I'm pretty sure it's not port blocking or local-network related. (PS - using netCDF4 v1.3.1)

import netCDF4
import numpy as np

mydate='20180509'

url2='http://nomads.ncep.noaa.gov:9090/dods/gfs_0p50/gfs'+ \
    mydate+'/gfs_0p50_00z'

print url2
#http://nomads.ncep.noaa.gov:9090/dods/gfs_0p50/gfs20180509/gfs_0p50_00z

#OPEN FILE
file2 = netCDF4.Dataset(url2)

#GET VARS  lat/lon/relhumidity
lat = file2.variables['lat'][:]
lon = file2.variables['lon'][:]
rh  = file2.variables['rhprs']   # kept as a Variable; data is only fetched on slicing

#LOCATION (SAN DIEGO)
latf=32.75
lonf=242.75

#FIND CLOSEST IDX
lonidx=np.abs(lon - lonf).argmin()
latidx=np.abs(lat - latf).argmin()

print latidx,lonidx
#245 485

print rh.shape
#(81, 47, 361, 720)

#EXTRACT DATA

#WORKS
rhpoint=rh[1,1,latidx,lonidx]
rhpoint=rh[1:5,1:5,latidx,lonidx]
rhpoint=rh[1:15,1:15,latidx,lonidx]
rhpoint=rh[1,:,latidx,lonidx]
rhpoint=rh[1:20,:,latidx,lonidx]
rhpoint=rh[1:30,:,latidx,lonidx]


#FAILS
rhpoint=rh[:,:,latidx,lonidx]
rhpoint=rh[1:50,:,latidx,lonidx]
rhpoint=rh[:,1:10,latidx,lonidx]
rhpoint=rh[:,1:20,latidx,lonidx]
rhpoint=rh[:,1:30,latidx,lonidx]

#FAILS return the following error:
#
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "netCDF4/_netCDF4.pyx", line 3961, in netCDF4._netCDF4.Variable.__getitem__
#   File "netCDF4/_netCDF4.pyx", line 4798, in netCDF4._netCDF4.Variable._get
#   File "netCDF4/_netCDF4.pyx", line 1638, in netCDF4._netCDF4._ensure_nc_success
# RuntimeError: NetCDF: file not found

graemerae avatar May 11 '18 03:05 graemerae

I've seen errors with both C's ncdump and netCDF-Java's ToolsUI when trying to get the rhprs array. Given that some requests work for you and some don't, it seems like the root cause is on the server. netcdf4-python could give a much better error in this case, though.

dopplershift avatar May 11 '18 18:05 dopplershift

Yeah, I tried a similar thing in R using the R netCDF library and discovered that I get essentially the same error.

tnoelcke avatar May 11 '18 21:05 tnoelcke

My errors seem to be related to the overall size of the requested data set (a 1x1 or 5x5 array is no problem, but a 30x50 array fails; somewhere in between is the cutoff). I'll talk to NCEP and see if they have any ideas.
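
If it really is request size, one workaround might be to split the request into smaller slices and concatenate them client-side (a sketch reusing rh, latidx, and lonidx from the example above; the block size is a guess):

import numpy as np

step = 10  # assumed to be below whatever request size triggers the error
blocks = [rh[i:i + step, :, latidx, lonidx]
          for i in range(0, rh.shape[0], step)]
rhpoint = np.concatenate(blocks, axis=0)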

graemerae avatar May 12 '18 00:05 graemerae

PS - I also tried setting the .dodsrc timeout variables to something ridiculously large, e.g.

HTTP.TIMEOUT=50000
CUROPT_TIMEOUT=10000

but it doesn't look like netcdf4-python honors those settings (unless I'm missing something).

graemerae avatar May 12 '18 00:05 graemerae

@tnoelcke - what is the exact data request you are making to the server (or the exact slice you are using)?

lesserwhirls avatar May 15 '18 17:05 lesserwhirls

Have any of you solved the problem? I am experiencing the same issue.

DanielIAvila avatar Aug 15 '18 22:08 DanielIAvila

I wasn't able to solve this problem. I ended up setting up a cron job to download the entire file from the server rather than trying to read it over the network; I did not have the same issues locally. Time became an issue for my project, so I went with that method instead.
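
For anyone going the same route: THREDDS servers typically expose the same file over plain HTTP via a fileServer endpoint, so the cron job can be a one-liner like the sketch below. Whether this particular server enables that endpoint is an assumption; the URL is just the dodsC path with fileServer swapped in.

import urllib  # Python 2; in Python 3 use urllib.request.urlretrieve

# assumed HTTP download endpoint: 'dodsC' replaced with 'fileServer'
url = ('http://tds-proxy.nkn.uidaho.edu/thredds/fileServer/'
       'NWCSC_INTEGRATED_SCENARIOS_ALL_CLIMATE/bcsd-nmme/'
       'monthlyForecasts/bcsd_nmme_metdata_ENSMEAN_forecast_1monthAverage.nc')
urllib.urlretrieve(url, 'forecast_local.nc')  # download the whole file once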

tnoelcke avatar Aug 16 '18 23:08 tnoelcke

I had a similar issue with this dataset.

The error arises only when trying to access certain variables ['HW', 'PW', 'HWA', 'PWA']; ncdump fails as well (e.g. ncdump -vHW https://thredds.met.no/thredds/dodsC/arcticdata/obsSynop/01361). I am clueless about the reason for the error, but I don't consider this a problem with xarray.

For now, I hacked my code this way ...

from netCDF4 import Dataset
import xarray as xr

nc_url = "https://thredds.met.no/thredds/dodsC/arcticdata/obsSynop/01361"

nc_fid = Dataset(nc_url, 'r')

valid_vars = []
for i in nc_fid.variables:
    try:
        nc_fid.variables[i][:]   # probe read: raises RuntimeError for the bad variables
        valid_vars.append(i)
    except RuntimeError:
        print('skip:', i)

ds = xr.open_dataset(nc_url)

df = ds[valid_vars].to_dataframe()

epifanio avatar Jun 17 '20 11:06 epifanio

@epifanio I get errors with those variables when I use netCDF-Java (through ToolsUI) as well. Something is at fault in the server configuration used to aggregate the individual netCDF files.

dopplershift avatar Jun 24 '20 19:06 dopplershift