From ncML to netCDF file using netCDF-Java: FillValue is not handled properly

Open risquez opened this issue 7 years ago • 7 comments

I am generating an empty netCDF file from an ncML description. However, I cannot define a _FillValue.

This issue could be related to "Ncml mishandling signedness" #923.

I follow these steps:

  1. The following ncML code is the input description. It defines only 2 variables (short and unsigned short) and tries to set their _FillValue attribute to "1". (Note: the same problem applies to other data types such as double, float, int, etc.)
<?xml version="1.0" encoding="UTF-8"?>
<ncml:netcdf xmlns:ncml="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">

   <ncml:variable name="my_short" shape="" type="short">
      <ncml:attribute name="_FillValue" type="short" value="1"/>
   </ncml:variable>

   <ncml:variable name="my_ushort" shape="" type="short">
      <ncml:attribute name="_Unsigned" value="true" />
      <ncml:attribute name="_FillValue" type="short" value="1" isUnsigned="true" />
   </ncml:variable>

</ncml:netcdf>
  2. Generate netCDF using this command:

java -Xmx1g -classpath netcdfAll-4.6.11.jar ucar.nc2.write.Nccopy --input test_FillValue.ncml --output test_FillValue.nc --format netcdf4

(Note: I am using the latest netCDF-Java library, 4.6.11 from 4/Dec/2017; and Java 1.8.0).

  3. I dump the netCDF file with the usual command (netCDF-C version 4.3.2):

ncdump -s test_FillValue.nc

and the output is:

netcdf test_FillValue {
variables:
        short my_short ;
                my_short:_FillValue = 1s ;
                my_short:_Endianness = "little" ;
        ushort my_ushort ;
                my_ushort:_Unsigned = "true" ;
                my_ushort:_Endianness = "little" ;
// global attributes:
                :_Format = "netCDF-4" ;
data:
 my_short = -32767 ;
 my_ushort = 32769 ;
}

The value for short equals NC_FILL_SHORT = -32767. However, NC_FILL_USHORT = 65535, not 32769 (see https://www.unidata.ucar.edu/software/netcdf/docs/netcdf_8h_source.html).

And in any case, the "_FillValue" is defined in the ncML (="1"), therefore it should overwrite any default value.
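
A side note on the 32769 printed for my_ushort: it is exactly the number obtained when the 16 bits of NC_FILL_SHORT (-32767) are read as an unsigned value. The Java sketch below only demonstrates that arithmetic. That the library really writes the signed default into the unsigned variable is an inference from the numbers, not something confirmed here, and the class and method names are made up for illustration.

```java
// Sketch: how a signed 16-bit fill value looks when its bits are read as unsigned.
// Class and method names are hypothetical; the constants come from netcdf.h.
class FillValueSignedness {
    static final short NC_FILL_SHORT = (short) -32767; // default fill for short
    static final int NC_FILL_USHORT = 65535;           // default fill for ushort

    // Reinterpret the 16 bits of a signed short as an unsigned number.
    static int asUnsigned(short value) {
        return Short.toUnsignedInt(value);
    }

    public static void main(String[] args) {
        // -32767 + 65536 = 32769, the value ncdump shows for my_ushort.
        System.out.println(asUnsigned(NC_FILL_SHORT));
        // The expected unsigned default, for comparison.
        System.out.println(NC_FILL_USHORT);
    }
}
```

If the reinterpretation is indeed what happens, the data value of my_ushort is the signed default fill rather than either NC_FILL_USHORT or the "1" requested in the ncML.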

Summarizing:

  • FillValue attributes cannot be defined for any data type using the netCDF-Java library with ncML as input (there is no problem with other attributes, including attributes starting with an underscore, like "_example").
  • When the FillValue is defined, automatically all data is populated with that value (and therefore it requires some processing time). Am I correct?
  • The default fill values written for unsigned data types do not match the netCDF-C defaults (for example, 32769 instead of NC_FILL_USHORT = 65535).

risquez avatar Feb 15 '18 17:02 risquez

@DennisHeimbigner - can you take a look at this one? I seem to remember we fixed something like this awhile back, but I can't seem to find it.

lesserwhirls avatar Feb 20 '18 23:02 lesserwhirls

Dear Sean Arms and Dennis Heimbigner,

Please let me give you some background information.

I submitted the following related request in September 2017:

[netCDFJava #PCM-232977]: netCDF-Java ignores "_FlllValue" attribute in ncML https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg14081.html

However, when I looked into it, I found another problem (I cannot define a particular "_FillValue" attribute value). In addition, the status of the request above is already "closed". Therefore I decided to submit this new request (#1036).

Thank you in advance. Daniel Risquez.

risquez avatar Feb 21 '18 08:02 risquez

@risquez You need to set the /netcdf@enhance attribute to "all". Enhancement is responsible for processing fill values, missing values, signedness conversion, etc. See this. If the /netcdf@enhance attribute is not explicitly set in an NcML document, no enhancement is done. I'm not sure why that's the default; "all" seems more useful, and less likely to be a gotcha for users. I'll bring it up in our meeting tomorrow.

So, suppose I have this NcML:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" enhance="all">
   <variable name="my_short" shape="" type="short">
       <attribute name="_FillValue" type="short" value="1"/>
   </variable>

   <variable name="my_ushort" shape="" type="short">
      <attribute name="_Unsigned" value="true" />
      <attribute name="_FillValue" type="short" value="1" isUnsigned="true" />
   </variable>
</netcdf>

If I use NetCDF-Java v4.6.11 to generate a NetCDF-4 file from it, as you have done, I get:

$ ncdump 1036_4.nc4 
netcdf \1036_4 {
variables:
	short my_short ;
		my_short:_FillValue = 1s ;
	ushort my_ushort ;
		my_ushort:_Unsigned = "true" ;
data:
 my_short = _ ;
 my_ushort = 1 ;
}

Better, but my_ushort's _FillValue is still mishandled. Next I'll try on NetCDF-Java v5.0.0, which includes the signedness changes I made in #934:

$ ncdump 1036_5.nc4 
netcdf \1036_5 {
variables:
	short my_short ;
		my_short:_FillValue = 1s ;
	ushort my_ushort ;
		my_ushort:_Unsigned = "true" ;
		my_ushort:_FillValue = 1US ;
data:
 my_short = _ ;
 my_ushort = _ ;
}

Bingo. Everything seems to be working there (the underscore indicates that a datum matches the fill value declared for a variable). On version 5, you can also simplify your NcML somewhat. This will work just as well:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" enhance="all">
   <variable name="my_short" shape="" type="short">
       <attribute name="_FillValue" type="short" value="1"/>
   </variable>

   <variable name="my_ushort" shape="" type="ushort">
      <attribute name="_FillValue" type="ushort" value="1" />
   </variable>
</netcdf>

Note that we haven't yet released v5.0, but we're getting close. If you'd like to experiment with a recent snapshot build, you can grab one here.

cwardgar avatar Feb 22 '18 07:02 cwardgar

I tried both suggestions on my computer (adding the "enhance" attribute and trying the latest snapshot). Both work well, as you indicated. Christian, thank you very much for your support.

risquez avatar Feb 22 '18 09:02 risquez

I would like to revisit this issue, just to summarize its status.

My problem is:

  • I need to define the FillValue attribute, and in addition,
  • keep enumerations as enumerations (not strings, see #1042).

I understand that I have to choose:

  • Either I define enhance="all" in the ncML (as indicated above), and therefore FillValue is OK, but enumerations are not OK (explanation here).
  • Or I do not set enhance="all" in the ncML, and therefore FillValue is not OK, but enumerations are OK (this was my situation before raising this issue).

Could you please confirm that my understanding is correct? That is, I cannot have both FillValue and enumerations the way I need them at the same time.

risquez avatar Mar 12 '18 09:03 risquez

@risquez try with enhance="ScaleMissing". It is supposed to apply only the scale/offset and missing enhancements, and not the enum conversion. The opposite should be enhance="ConvertEnums", i.e. convert enums to strings without the scale/offset and missing enhancements.
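
Since the behaviour depends entirely on what the root element's enhance attribute says, a small check of the ncML before running Nccopy can help. The sketch below uses only the JDK's XML parser and does not touch netCDF-Java; the class and method names are hypothetical, and it reports the attribute value without validating it against the modes the library accepts.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Sketch: read the enhance attribute from the root element of an ncML document.
// Names are hypothetical; only JDK classes are used.
class NcmlEnhanceCheck {
    // Returns the root element's "enhance" value, or "" when it is not set.
    static String enhanceMode(String ncml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            byte[] bytes = ncml.getBytes(StandardCharsets.UTF_8);
            Element root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes))
                    .getDocumentElement();
            return root.getAttribute("enhance");
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse ncML", e);
        }
    }

    public static void main(String[] args) {
        String ns = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2";
        String ncml = "<netcdf xmlns=\"" + ns + "\" enhance=\"ScaleMissing\"/>";
        String mode = enhanceMode(ncml);
        System.out.println(mode.isEmpty() ? "enhance not set" : "enhance=" + mode);
    }
}
```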

cofinoa avatar Mar 12 '18 17:03 cofinoa

This issue is still alive and well for us at EUMETSAT, as when NetcdfDataset.Enhance is set to ScaleMissingDefer, netCDF-Java "promotes" enums to strings, although ScaleMissingDefer shouldn't touch enums.

Using ScaleMissing isn't an option, because the data should not be converted.
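
To make "converted" concrete, here is a minimal sketch of the usual CF-style unpacking that a scale/offset enhancement performs, assuming the variable carries scale_factor and add_offset attributes. The numbers and names are hypothetical and this is not netCDF-Java's own code; it only shows why a packed short no longer comes out as a short.

```java
// Sketch: CF-style unpacking, unpacked = packed * scale_factor + add_offset.
// Hypothetical names and values; not taken from netCDF-Java.
class ScaleOffsetSketch {
    // The result is a float, so the native short type of the packed value is lost.
    static float unpack(short packed, float scaleFactor, float addOffset) {
        return packed * scaleFactor + addOffset;
    }

    public static void main(String[] args) {
        short packed = 100;      // value as stored in the file
        float scaleFactor = 0.5f;
        float addOffset = 10.0f;
        // 100 * 0.5 + 10 = 60.0
        System.out.println(unpack(packed, scaleFactor, addOffset));
    }
}
```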

@risquez made a nice overview using:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">

   <variable name="my_short" shape="" type="short">
       <attribute name="_FillValue" type="short" value="1"/>
   </variable>

   <variable name="my_ushort" shape="" type="short">
      <attribute name="_Unsigned" value="true" />
      <attribute name="_FillValue" type="short" value="1" isUnsigned="true" />
   </variable>
</netcdf>

Setting different values for enhance results in:

| Attribute | NetcdfDataset.Enhance not set | NetcdfDataset.Enhance == "ScaleMissing" | NetcdfDataset.Enhance == "all" |
| --- | --- | --- | --- |
| valid_range | yes | no | no |
| add_offset | yes | no | no |
| scale_factor | yes | no | no |
| _FillValue | no | no | no |
| _unsigned | yes | yes | yes |
| Resulting data type | native short | native unsigned short | native float |

As you can see, without enhancements the attributes are preserved correctly, but no _FillValue is applied. Using enhancements breaks other parts of the data as well.

At the moment, this is a barrier for us to use netCDF-Java in our operations for Meteosat Third Generation.

erget avatar Apr 12 '18 13:04 erget