thredds
From ncML to netCDF file using netCDF-Java: FillValue is not handled properly
I am generating an empty netCDF file using the ncML description as input. However, I cannot define a FillValue.
This issue could be related to "Ncml mishandling signedness" #923.
I follow these steps:
- The following ncML code is the input description. It defines only two variables (short and unsigned short) and tries to set their FillValue attribute to "1". (Note: the same problem applies to other data types such as double, float, int, etc.)
<?xml version="1.0" encoding="UTF-8"?>
<ncml:netcdf xmlns:ncml="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
<ncml:variable name="my_short" shape="" type="short">
<ncml:attribute name="_FillValue" type="short" value="1"/>
</ncml:variable>
<ncml:variable name="my_ushort" shape="" type="short">
<ncml:attribute name="_Unsigned" value="true" />
<ncml:attribute name="_FillValue" type="short" value="1" isUnsigned="true" />
</ncml:variable>
</ncml:netcdf>
- Generate netCDF using this command:
java -Xmx1g -classpath netcdfAll-4.6.11.jar ucar.nc2.write.Nccopy --input test_FillValue.ncml --output test_FillValue.nc --format netcdf4
(Note: I am using the latest netCDF-Java library, 4.6.11 from 4/Dec/2017; and Java 1.8.0).
- I dump the netCDF file with the usual command (netCDF-C version 4.3.2):
ncdump -s test_FillValue.nc
and the output is:
netcdf test_FillValue {
variables:
short my_short ;
my_short:_FillValue = 1s ;
my_short:_Endianness = "little" ;
ushort my_ushort ;
my_ushort:_Unsigned = "true" ;
my_ushort:_Endianness = "little" ;
// global attributes:
:_Format = "netCDF-4" ;
data:
my_short = -32767 ;
my_ushort = 32769 ;
}
The value for short equals NC_FILL_SHORT=-32767, however NC_FILL_USHORT=65535, not 32769. https://www.unidata.ucar.edu/software/netcdf/docs/netcdf_8h_source.html
And in any case, the "_FillValue" is defined in the ncML (="1"), therefore it should overwrite any default value.
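One possible reading of the 32769 above (an assumption, not something confirmed in this report) is that the signed default fill value was written and then reinterpreted as an unsigned 16-bit number. The arithmetic can be checked with a small stand-alone sketch; the class and method names are made up for illustration and are not part of netCDF-Java:

```java
public class FillValueSignedness {
    // Reinterprets the 16 bits of a signed short as an unsigned value (0..65535).
    static int asUnsigned(short value) {
        return Short.toUnsignedInt(value);
    }

    public static void main(String[] args) {
        short ncFillShort = -32767;   // NC_FILL_SHORT, as quoted above
        int ncFillUshort = 65535;     // NC_FILL_USHORT, as quoted above

        // The bit pattern of -32767 is 0x8001, which reads as 32769 when unsigned.
        System.out.println(asUnsigned(ncFillShort));                 // prints 32769
        // That is not the documented unsigned default.
        System.out.println(asUnsigned(ncFillShort) == ncFillUshort); // prints false
    }
}
```

So the value dumped for my_ushort equals NC_FILL_SHORT viewed as unsigned, not NC_FILL_USHORT.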
Summarizing:
- FillValue attributes cannot be defined for any data type using the netCDF-Java library with ncML as input (there is no problem with other attributes, including ones that start with an underscore, like "_example").
- When the FillValue is defined, automatically all data is populated with that value (and therefore it requires some processing time). Am I correct?
- The default fill values written for unsigned data types do not match the netCDF defaults (e.g. 32769 is written instead of NC_FILL_USHORT=65535).
@DennisHeimbigner - can you take a look at this one? I seem to remember we fixed something like this awhile back, but I can't seem to find it.
Dear Sean Arms and Dennis Heimbigner,
Please let me give you some background information.
I submitted the following related request in September 2017:
[netCDFJava #PCM-232977]: netCDF-Java ignores "_FlllValue" attribute in ncML https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg14081.html
However, when I looked into it, I found another problem (I cannot define a particular “_FillValue” attribute value). In addition, the status of the request above is already “closed”. Therefore I decided to submit this new request (#1036).
Thank you in advance. Daniel Risquez.
@risquez You need to set the /netcdf@enhance attribute to "all". Enhancement is responsible for processing fill values, missing values, signedness conversion, etc. See this. If the /netcdf@enhance attribute is not explicitly set in an NcML document, no enhancement is done. I'm not sure why that's the default; "all" seems more useful, and less likely to be a gotcha for users. I'll bring it up in our meeting tomorrow.
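Since a missing /netcdf@enhance is easy to overlook, here is a rough sketch of how one might check an NcML document for it before feeding it to netCDF-Java. It uses only the JDK's XPath support; the class and method names are hypothetical and nothing here is part of the netCDF-Java API:

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class EnhanceAttributeCheck {
    // Returns the value of the root <netcdf> element's enhance attribute,
    // or an empty string when the attribute is absent.
    static String enhanceMode(String ncml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // local-name() ignores the namespace prefix, so both <netcdf> and
        // <ncml:netcdf> roots are matched.
        return xpath.evaluate("/*[local-name()='netcdf']/@enhance",
                new InputSource(new StringReader(ncml)));
    }

    public static void main(String[] args) throws Exception {
        String ns = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2";
        String enhanced = "<netcdf xmlns=\"" + ns + "\" enhance=\"all\"/>";
        String plain = "<ncml:netcdf xmlns:ncml=\"" + ns + "\"/>";

        System.out.println("[" + enhanceMode(enhanced) + "]"); // prints [all]
        System.out.println("[" + enhanceMode(plain) + "]");    // prints []
    }
}
```

An empty result corresponds to the case described above, where no enhancement is done.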
So, suppose I have this NcML:
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" enhance="all">
<variable name="my_short" shape="" type="short">
<attribute name="_FillValue" type="short" value="1"/>
</variable>
<variable name="my_ushort" shape="" type="short">
<attribute name="_Unsigned" value="true" />
<attribute name="_FillValue" type="short" value="1" isUnsigned="true" />
</variable>
</netcdf>
If I use NetCDF-Java v4.6.11 to generate a NetCDF-4 file from it, as you have done, I get:
$ ncdump 1036_4.nc4
netcdf \1036_4 {
variables:
short my_short ;
my_short:_FillValue = 1s ;
ushort my_ushort ;
my_ushort:_Unsigned = "true" ;
data:
my_short = _ ;
my_ushort = 1 ;
}
Better, but my_ushort's _FillValue is still mishandled. Next I'll try on NetCDF-Java v5.0.0, which includes the signedness changes I made in #934:
$ ncdump 1036_5.nc4
netcdf \1036_5 {
variables:
short my_short ;
my_short:_FillValue = 1s ;
ushort my_ushort ;
my_ushort:_Unsigned = "true" ;
my_ushort:_FillValue = 1US ;
data:
my_short = _ ;
my_ushort = _ ;
}
Bingo. Everything seems to be working there (the underscore indicates that a datum matches the fill value declared for a variable). On version 5, you can also simplify your NcML somewhat. This will work just as well:
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" enhance="all">
<variable name="my_short" shape="" type="short">
<attribute name="_FillValue" type="short" value="1"/>
</variable>
<variable name="my_ushort" shape="" type="ushort">
<attribute name="_FillValue" type="ushort" value="1" />
</variable>
</netcdf>
Note that we haven't yet released v5.0, but we're getting close. If you'd like to experiment with a recent snapshot build, you can grab one here.
I tried both suggestions on my computer (adding the "enhance" attribute and trying the latest snapshot). Both work well, as you indicated. Christian, thank you very much for your support.
I would like to revisit this issue to summarize its status.
My problem is:
- I need to define the FillValue attribute, and in addition,
- keep enumerations as enumerations (not strings, see #1042).
I understand that I have to choose:
- Either I define enhance="all" in the ncML (as indicated above), and therefore FillValue is OK, but enumerations are not OK (explanation here).
- Or I do not set enhance="all" in the ncML, and therefore FillValue is not OK, but enumerations are OK (this was my situation before raising this issue).
Could you please confirm that my understanding is correct? It seems that FillValue and enumerations cannot both behave as I need at the same time.
@risquez try with enhance="ScaleMissing". It is supposed to apply only the scale/offset and missing-value enhancements, not the enum conversion. The opposite should be enhance="ConvertEnums", i.e. convert enums to strings without applying the scale/offset and missing-value enhancements.
This issue is still alive and well for us at EUMETSAT: when NetcdfDataset.Enhance is set to ScaleMissingDefer, netCDF-Java "promotes" enums to strings, although ScaleMissingDefer shouldn't touch enums.
Using ScaleMissing isn't an option, because the data should not be converted.
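To make "converted" concrete, here is a minimal sketch of the kind of CF-style unpacking a scale/missing enhancement implies: values equal to the fill value become NaN, and everything else is scaled and offset into floats. This is an illustration under that assumption, not netCDF-Java's actual code, and the class and method names are hypothetical:

```java
public class ScaleMissingSketch {
    // Unpacks raw shorts into floats: fill values become NaN, all other
    // values are computed as raw * scaleFactor + addOffset.
    static float[] unpack(short[] raw, short fillValue, float scaleFactor, float addOffset) {
        float[] out = new float[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = (raw[i] == fillValue)
                    ? Float.NaN
                    : raw[i] * scaleFactor + addOffset;
        }
        return out;
    }

    public static void main(String[] args) {
        // The stored values are shorts; the result no longer is.
        float[] unpacked = unpack(new short[] {1, 10}, (short) 1, 0.5f, 2.0f);
        System.out.println(unpacked[0]); // prints NaN
        System.out.println(unpacked[1]); // prints 7.0
    }
}
```

The native short array is replaced by a float array, which is exactly the kind of change that has to be avoided when the stored data type must be kept.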
@risquez made a nice overview using:
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
<variable name="my_short" shape="" type="short">
<attribute name="_FillValue" type="short" value="1"/>
</variable>
<variable name="my_ushort" shape="" type="short">
<attribute name="_Unsigned" value="true" />
<attribute name="_FillValue" type="short" value="1" isUnsigned="true" />
</variable>
</netcdf>
Setting different values for enhance results in:
Attribute | NetcdfDataset.Enhance not set | NetcdfDataset.Enhance == "ScaleMissing" | NetcdfDataset.Enhance == "all" |
---|---|---|---|
valid_range | yes | no | no |
add_offset | yes | no | no |
scale_factor | yes | no | no |
_FillValue | no | no | no |
_unsigned | yes | yes | yes |
Resulting data type | native short | native unsigned short | native float |
As you can see, without enhancements the attributes are preserved correctly, but no _FillValue is applied. Using enhancements breaks other parts of the data as well.
At the moment, this is a barrier for us to use netCDF-Java in our operations for Meteosat Third Generation.