Enum Types
I would like to use NCDatasets to create files using enumerated types. As far as I can see from the documentation and the code, this is not supported. Would it be possible to add it?
You are right, enum types are currently not supported. It is certainly within scope and doable; it just takes some time to write and test the code. Here is a start on exposing the low-level functions (https://github.com/Alexander-Barth/NCDatasets.jl/commit/f82c24afe92b327903beba6082b291b181382510).
I am wondering what the return type of the higher-level function should be. Maybe a Julia array of Symbols, or a CategoricalArray/PooledArrays/IndirectArrays... I am not so familiar with these array types.
Cool, thanks. For now I need to write rather than read the enum type, but this is still very helpful.
"I am wondering what the return type of the higher-level function should be. Maybe a Julia array of Symbols, or a CategoricalArray/PooledArrays/IndirectArrays... I am not so familiar with these array types." Not sure, I have never used them, but I guess it should be possible to use Julia's @enum types.
In NetCDF, an identifier (Clear in the example below) can appear in different enum types:
netcdf enum2 {
types:
  byte enum cloud_t {Clear = 0, Cumulonimbus = 1, Stratus = 2,
      Stratocumulus = 3, Cumulus = 4, Altostratus = 5, Nimbostratus = 6,
      Altocumulus = 7, Missing = 127} ;
  byte enum cloud2_t {Clear = 10, Cumulonimbus = 11} ;
dimensions:
    time = UNLIMITED ; // (5 currently)
variables:
    cloud_t primary_cloud(time) ;
        cloud_t primary_cloud:_FillValue = Missing ;
}
However, Julia doesn't let me do that, because @enum defines each identifier as a constant in the enclosing scope:
julia> @enum cloud_t Clear=0
julia> @enum cloud_t2 Clear=10
ERROR: invalid redefinition of constant Clear
Stacktrace:
 [1] top-level scope
   @ Enums.jl:198
 [2] top-level scope
   @ REPL[5]:1
Julia keywords can also be a problem:
julia> @enum cloud_t3 end=10
ERROR: syntax: extra token "end" after end of expression
Stacktrace:
 [1] top-level scope
   @ none:1
While Julia's @enum seems natural (after all, it has the same name as NetCDF enums ;-) ), I am not sure it is the best (or a safe) choice here.
I just checked with Python's netCDF4, and it simply returns the numbers:
In [2]: import netCDF4
In [3]: ds = netCDF4.Dataset("enum.nc")
In [6]: ds["primary_cloud"][:]
Out[6]:
masked_array(data=[0, 2, 4, --, 1],
             mask=[False, False, False,  True, False],
       fill_value=127,
            dtype=int8)
In [7]: data = ds["primary_cloud"][:]
In [9]: data[0]
Out[9]: 0
In [10]: data[1]
Out[10]: 2
The same is true for Python's xarray.
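If the plain numbers are returned, the names can still be recovered with one lookup table per enum type. Here is a minimal sketch in plain Python (no netCDF4 import, so it runs on its own). The helper `decode_enum` and the constants `CLOUD_T` and `CLOUD2_T` are hypothetical names made up for this illustration. The dictionaries are copied by hand from the CDL above, and I am assuming the masked element is stored as 127 (Missing) on disk. As far as I know, netCDF4 carries a comparable name-to-value dictionary (`enum_dict`) on its enum types.

```python
# Hypothetical helper for illustration; not part of netCDF4 or NCDatasets.
# Enum definitions copied by hand from the CDL example in this thread.
CLOUD_T = {
    "Clear": 0, "Cumulonimbus": 1, "Stratus": 2, "Stratocumulus": 3,
    "Cumulus": 4, "Altostratus": 5, "Nimbostratus": 6, "Altocumulus": 7,
    "Missing": 127,
}
CLOUD2_T = {"Clear": 10, "Cumulonimbus": 11}


def decode_enum(codes, enum_dict, fill_value=None):
    """Map integer codes to identifier names; fill values become None."""
    # Invert the name -> value mapping once, then look up each code.
    names = {value: name for name, value in enum_dict.items()}
    return [None if code == fill_value else names[code] for code in codes]


# Assumed raw values of primary_cloud (127 is the _FillValue, i.e. Missing).
raw = [0, 2, 4, 127, 1]
decoded = decode_enum(raw, CLOUD_T, fill_value=CLOUD_T["Missing"])
print(decoded)

# "Clear" lives in both tables without any clash, since names are only keys.
print(decode_enum([10, 11], CLOUD2_T))
```

Because the identifiers are just dictionary keys (strings), the same name can appear in several enum types, and a name that happens to be a language keyword is not a problem either.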
(For your information, I updated test_enum.jl.)