netcdf4-python Error opening netcdf in path with "special" characters

Hi,

I could not open a netcdf file whose path contained a German umlaut in Python 3 (FileNotFoundError: [Errno 2] No such file or directory). A workaround was to change the folder and use only the filename:

import os
import netCDF4 as nc
s = "C:/path_with_ä/test.nc"
ncf = nc.Dataset(s)  # raises FileNotFoundError
# workaround: change into the folder and open by file name only
path, fn = os.path.split(s)
os.chdir(path)
ncf = nc.Dataset(fn)  # worked
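
The workaround above leaves the process in a different working directory. A minimal sketch that restores the previous directory afterwards is shown here; `working_directory` and `open_from_parent_dir` are hypothetical helper names (not part of netCDF4), and the opener is passed in as a parameter so the same helper can be tried with the built-in `open`:

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the current working directory, then restore it."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def open_from_parent_dir(full_path, opener, *args, **kwargs):
    """Hypothetical helper: call opener(basename, ...) from inside the file's folder.

    opener is whatever callable opens the file, e.g. netCDF4.Dataset.
    """
    folder, name = os.path.split(full_path)
    # An empty folder means the path was already a bare file name.
    with working_directory(folder or "."):
        return opener(name, *args, **kwargs)
```

With netCDF4 imported as `nc`, this would be called as `ncf = open_from_parent_dir("C:/path_with_ä/test.nc", nc.Dataset)`.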

I tested v1.5.1.2 under Python 3.7.2 (64bit, Win10) as installed via conda-forge. Interestingly, using a unicode string in Python 2 worked well with netcdf4 v1.4.1.

Jun 24 '19 20:06 itati01

This works for me in Python 3.6

from netCDF4 import Dataset
filename = '\xc3\xbc.nc'
nc = Dataset(filename, 'w')
nc.close()

Jun 26 '19 13:06 jswhit

Works for me as well. However, the file shows up as "ÃƒÂ¼.nc" in Windows Explorer when created under Windows, but as "Ã¼.nc" when created under Linux (Ubuntu 18.04 via WSL), both in the Linux shell and in Windows Explorer.

Now, filename = "ää.nc" results in "Ã¤Ã¤.nc" under Windows but "ää.nc" under Linux. Simulating the behaviour above, I used filename = "äää/aaa.nc" which results in an error under Windows (Errno 13: Permission denied) while everything is fine under Linux (the path shows up correctly in Win Explorer as well). os.makedirs("äää") works as expected.

Jun 26 '19 14:06 itati01

Not a Unicode expert, but netcdf4-python uses utf-8 encoding by default (can be changed with the encoding Dataset kwarg). Maybe Windows uses a different encoding?
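
One possible explanation for the names reported above, sketched under the assumption that the UTF-8 bytes of the filename end up being interpreted in the Windows-1252 code page (common on Western European Windows installs, but not confirmed anywhere in this thread). The helper name `misread_as_cp1252` is made up for illustration:

```python
def misread_as_cp1252(name):
    """Encode a name as UTF-8, then decode those bytes as Windows-1252.

    Strict decoding is used, so names whose UTF-8 bytes include a value that
    cp1252 leaves undefined (e.g. 0x81) would raise UnicodeDecodeError.
    """
    return name.encode("utf-8").decode("cp1252")


# "ä" is the two bytes C3 A4 in UTF-8, which cp1252 reads as "Ã" and "¤".
assert misread_as_cp1252("ää.nc") == "Ã¤Ã¤.nc"

# In Python 3, '\xc3\xbc' is already the two-character string "Ã¼" (not "ü"),
# so the same misreading applied to it gives the doubly garbled name.
assert "\xc3\xbc.nc" == "Ã¼.nc"
assert misread_as_cp1252("\xc3\xbc.nc") == "ÃƒÂ¼.nc"
```

Both results match the names seen in Windows Explorer, which is consistent with (but does not prove) an encoding mismatch somewhere between Python and the file system.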

Jun 26 '19 14:06 jswhit

This is related to https://github.com/Unidata/netcdf4-python/issues/686.

There is actually a test for this for windows (tst_filepath.py). I suggest you try using encoding=sys.getfilesystemencoding().
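
A minimal sketch of that suggestion, assuming netCDF4 is installed; `open_with_fs_encoding` is a hypothetical wrapper name, and the only library feature used is the `encoding` keyword of `Dataset` mentioned above:

```python
import sys


def open_with_fs_encoding(path, mode="r"):
    """Hypothetical helper: open a Dataset, encoding the path with the
    file system encoding that Python reports for this platform."""
    # Imported lazily so the module loads even where netCDF4 is missing.
    from netCDF4 import Dataset

    return Dataset(path, mode, encoding=sys.getfilesystemencoding())
```

Whether this changes anything depends on what `sys.getfilesystemencoding()` returns on the machine in question.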

Jun 26 '19 14:06 jswhit

It does indeed look like a Unicode issue, although I am far from being an expert. sys.getfilesystemencoding() returns "utf-8" in Python 3, under both Linux and Windows.

In Python 2, using filename = u"ää.nc" works under Windows, but filename = u"ää/ää.nc" works only if the folder "ää" already exists, e.g.

import os
filename = u'ää/ää.nc'
path, fn = os.path.split(filename)
if not os.path.exists(path):
    os.makedirs(path)    # os handles non-ASCII characters correctly
from netCDF4 import Dataset
nc = Dataset(filename, 'w') # fn is also fine
nc.close()

The path names also appear correctly in Win Explorer. sys.getfilesystemencoding() returns "mbcs" here.

Jun 26 '19 15:06 itati01

Is the problem resolved for python 3 on windows? I'm not clear on what works and what doesn't work.

Jun 27 '19 15:06 jswhit

Sorry for the confusion. No, the problem is not solved for Python 3 on Win. Python 2 on Win works partly, and Python 3 on Linux works fully. The Unicode issues on Win seem to result in wrong (Py 3) albeit valid file names (your example) but invalid folder names (Py 2+3, my examples).

Jun 27 '19 17:06 itati01

OK, thanks for the clarification. Not having access to Windows I'm not sure where to go from here. One question that comes to mind is whether the same issue arises if you try to open a text file in Windows (independent of netcdf4-python)?
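
A standard-library-only check along those lines might look like the sketch below. The function name `text_file_roundtrip` and the folder and file names are placeholders chosen for illustration; nothing here touches netcdf4-python or HDF5:

```python
import os
import tempfile


def text_file_roundtrip(base_dir, folder="äää", name="ää.txt"):
    """Create a non-ASCII folder and text file, write to it, and read it back.

    Returns the text read back and whether the file exists under that name.
    """
    target_dir = os.path.join(base_dir, folder)
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, name)
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello")
    with open(path, "r", encoding="utf-8") as f:
        content = f.read()
    return content, os.path.isfile(path)


if __name__ == "__main__":
    # Run inside a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(text_file_roundtrip(tmp))
```

If this succeeds on Windows while the equivalent `Dataset` call fails, the problem would sit below Python's own file handling rather than in it.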

Jun 27 '19 18:06 jswhit

For reference

https://github.com/h5py/h5py/issues/839

Not sure if this is related or not, but it's a nice discussion of the general problem.

Jun 28 '19 13:06 jswhit

Also

https://forum.hdfgroup.org/t/non-english-characters-in-hdf5-file-name/4627/3

Seems clear that unicode filenames are not fully supported in HDF5 on windows as of yet.

Jun 29 '19 15:06 jswhit

Thanks for the link to the interesting discussion. So, let's hope that they fix this issue at some point. By the way, creating a folder and writing to a new text file with os.makedirs() and write() works as expected.

Jul 01 '19 11:07 itati01
