Calibration levels naming
Following the last community meeting, here is an issue for organizing our thoughts on calibration levels.
First: the current names for which there is no issue: counts (integer counts from L1 files), radiances (calibrated using parameters from the files), brightness temperature (again, calibrated following the parameters).
Second: the reflectances. Currently (if I understand correctly) this is the default calibration for visible bands. The current reflectance calibration is the "uncorrected" one, which is mostly used in composites after going through the satpy.modifiers.SunZenithCorrector modifier.
Possible reflectance calibrations:
- Uncorrected reflectance (current reflectance)
- toa_bidirectional_reflectance - "li shibata corrected"
Ideally, we should agree on a set of short names to be used internally for selecting the calibration mode and have it agree with CF quantities.
My understanding from the conversation 2 days ago is that the /cos(SZA) reflectance and the "li shibata corrected" reflectance are in the same category and therefore the same calibration "level".
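For reference, the /cos(SZA) normalization mentioned above can be sketched as below. This is only a minimal scalar illustration, not Satpy's SunZenithCorrector (which works on arrays and, as far as I know, also limits the correction at large zenith angles). The function name cos_sza_correct is made up for this example.

```python
import math


def cos_sza_correct(reflectance, sza_deg):
    """Divide a reflectance by the cosine of the solar zenith angle (degrees)."""
    cos_sza = math.cos(math.radians(sza_deg))
    if cos_sza <= 0:
        # cos(SZA) <= 0 means the sun is below the horizon; the ratio is meaningless.
        raise ValueError("cos(SZA) <= 0; correction undefined")
    return reflectance / cos_sza


# Hypothetical values: an uncorrected reflectance of 0.4 seen at SZA = 60 degrees.
corrected = cos_sza_correct(0.4, 60.0)
```

The result is still a reflectance, which is why this step reads more like a modifier than a new calibration.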
It should also be noted that there is the calibration level and the name of the modifier. The calibration level is more about identifying/categorizing the data while the modifier is kind of describing what has been done to the data to get it to its current state. However, "normal" calibration steps (like those in the reader code) are not described by modifiers at this time. So currently we'll usually get something from the reader with calibration="reflectance" and then apply the sunz_corrected modifier. So you end up with an identifier like DataID(name="C01", calibration="reflectance", modifiers=("sunz_corrected",)). Later you might also get the "rayleigh_corrected" modifier applied and that is added on to the modifiers tuple.
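To make the bookkeeping concrete, here is a small sketch of how the identifier evolves. SketchID is a hypothetical stand-in written for this comment, not Satpy's real DataID class, and add_modifier is an invented helper; the point is only that the calibration stays fixed while the modifiers tuple grows.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchID:
    """Hypothetical stand-in for a dataset identifier (not satpy's DataID)."""

    name: str
    calibration: str
    modifiers: tuple = ()


def add_modifier(data_id, modifier):
    """Return a new ID with `modifier` appended; the calibration is left unchanged."""
    return replace(data_id, modifiers=data_id.modifiers + (modifier,))


# What the reader hands over: calibrated, but not yet modified.
from_reader = SketchID(name="C01", calibration="reflectance")
# Each later correction is recorded by extending the modifiers tuple.
sunz = add_modifier(from_reader, "sunz_corrected")
rayleigh = add_modifier(sunz, "rayleigh_corrected")
```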
Maybe I misunderstand, but (like I think Dave is saying) we need to be careful what we call these things.
digital_count, radiance, brightness_temperature and reflectance are calibrations. li_shibata and cos(sza) are not calibrations. We must be extremely careful not to confuse the two terms. These last two things are, just like Dave says, modifiers to a calibration.
Also, the calibrations are not standard names, so no need to be extremely precise in the naming, just document what they mean in general.
Taking a step back, what exactly is the definition of calibration? ISO 80000 defines quantities (temperature, radiance) and units (K, mW m⁻² sr⁻¹ (cm⁻¹)⁻¹). This does not map cleanly to standard names and calibrations. Do we have a rigorous definition of what we mean by calibration? Is it a synonym for a quantity? If yes, should we even rename it quantity or calibrated_quantity? Or is calibration something else? Modifiers do not usually change the quantity (except NIRReflectance). Does changing the quantity mean we should also change the calibration?
Does CF define how its notion of a standard name relates to the ISO 80000 definition of a quantity?
However, "normal" calibration steps (like those in the reader code) are not described by modifiers at this time.
🤔
What if we do remove calibration and describe everything in terms of modifiers?
DataID(name='HRV', wavelength=WavelengthRange(min=0.5, central=0.7, max=0.9, unit='µm'), resolution=1000.134348869, modifiers=("counts_to_radiance", "radiance_to_reflectance"))
In the attributes, standard name + units should fully describe what we have. Modifiers contain a trace of what has been done to the data since they were loaded from the file. What does calibration add?
As far as I can see, modifiers do not have sufficient access to the files to handle calibration without a major overhaul of file handlers to supply standardized calibration coefficients to the dataset attributes.
I am not proposing to move the calibration code outside the readers / file handlers. The calibration code would remain where it is now and no file handler would need to be updated, but the DataID bookkeeping would be done via the modifiers rather than via the calibration. This is similar to how the MSI, OLCI L2, and VIIRS reader YAML configs define modifiers for sunz and rayleigh corrections where the data have those corrections already applied.
I'm "thinking aloud", not really proposing anything concretely. Maybe what I'm thinking doesn't make sense. Currently, the list of modifiers in the DataID is a list of operations that were either already done before satpy reads the data (sunz_corrected for VIIRS SDR, rayleigh_corrected for OLCI L2), or performed by satpy after calibrating the data (any modifier defined in satpy). In principle, calibration steps performed by the FileHandler could also be described in the list of modifiers. This would make the "calibration" property redundant. Advantage: we would not need to find a name ;)
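As a rough sketch of why "calibration" would become redundant: if calibration steps were listed in the modifiers, the resulting quantity could be derived from the trace. Everything below is hypothetical; the step names "counts_to_radiance" and "radiance_to_reflectance" come from the example above, the third one and the function infer_calibration are invented here.

```python
# Hypothetical table: which quantity each calibration-like step produces.
QUANTITY_AFTER_STEP = {
    "counts_to_radiance": "radiance",
    "radiance_to_reflectance": "reflectance",
    "radiance_to_brightness_temperature": "brightness_temperature",
}


def infer_calibration(modifiers, initial="counts"):
    """Walk the modifier trace and return the quantity it ends in."""
    quantity = initial
    for step in modifiers:
        # Steps not in the table (e.g. "sunz_corrected") leave the quantity unchanged.
        quantity = QUANTITY_AFTER_STEP.get(step, quantity)
    return quantity


trace = ("counts_to_radiance", "radiance_to_reflectance", "sunz_corrected")
derived = infer_calibration(trace)
```

Here derived is "reflectance", i.e. the same information the calibration key carries today.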
ok ok, I am quite used to opening issues and having nobody reply for weeks :-D
- I agree with @djhoese those are not reflectance "levels". This was not my intention, more to have "qualifiers" (modifiers for satpy). Indeed, li-shibata and "1/cos(sza)" are at the same level but simply different.
- Good point @simonrp84 that converting the quantity to a different physical quantity is not calibration. Only the "radiance" (counts to radiance) and BT (radiance to BT) processes can be thought of as calibration (for BT, there is a correspondence between the spectral response, radiance, and BT that is established by the instrument operator). Reflectance is so-so, as radiance to reflectance is "only" a ratio with the incoming solar irradiance in a way. (I won't argue either way, to be honest.)
- The idea of chaining modifiers (even if keeping the calibration within the reader) is interesting!
@gerritholl what you're saying makes sense and would probably work, but is inconvenient (read: annoying) for users and composite definitions. Every time we've suggested users specify modifiers=("sunz_corrected",) or every time a composite wants rayleigh correction (ex. modifiers=("sunz_corrected", "rayleigh_corrected")), they would now have to do modifiers=("counts_to_radiance", "radiance_to_reflectance", "sunz_corrected", "rayleigh_corrected"). Plus, "counts_to_radiance" wouldn't apply to some readers depending on how you look at it. Some file formats store integers in the on-disk file, but they aren't "digital counts" from the instrument even if some readers do call them counts. So in those cases "counts_to_radiance" would have to be included for consistency and to work with the rest of the system. This would also mean users could never request modifiers=() (no modifier/calibration), but this is not a completely new idea.
Actually, it would also be difficult to implement, not just difficult for users. This breaks the internal Satpy logic for preferring the "least modified" version of a dataset when it is not explicit/inferred. For example, take a Scene that has a pseudo-reflectance, a reflectance, and a rayleigh corrected reflectance version of "C01", and you say scn["C01"]. What gets returned when you have the "calibrations" included? The calibrations currently prefer the "highest" calibration:
https://github.com/pytroll/satpy/blob/6d843f67ac8bde26f437101ed7a82541b10d625c/satpy/dataset/dataid.py#L645-L650
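To illustrate the concern, here is a rough sketch of that kind of preference. It is not the linked Satpy code; the ordering list and the function names are assumptions made for this comment. The first half prefers the highest calibration and then the fewest modifiers; the second half shows what "fewest modifiers" alone would pick once calibration steps are folded into the modifiers.

```python
# Assumed ordering from "lowest" to "highest"; the real one lives in Satpy's configuration.
CALIBRATION_ORDER = ["counts", "radiance", "brightness_temperature", "reflectance"]


def preference_key(candidate):
    """Sort key: highest calibration first, then fewest modifiers."""
    calibration, modifiers = candidate
    return (-CALIBRATION_ORDER.index(calibration), len(modifiers))


def pick_default(candidates):
    """Pick the candidate with the smallest preference key."""
    return min(candidates, key=preference_key)


candidates = [
    ("radiance", ()),
    ("reflectance", ()),
    ("reflectance", ("sunz_corrected",)),
    ("reflectance", ("sunz_corrected", "rayleigh_corrected")),
]
chosen = pick_default(candidates)

# The same datasets with calibration expressed as modifiers only.
traces = [
    ("counts_to_radiance",),
    ("counts_to_radiance", "radiance_to_reflectance"),
    ("counts_to_radiance", "radiance_to_reflectance", "sunz_corrected"),
]
# "Least modified" now means the shortest trace, which is the radiance.
least_modified = min(traces, key=len)
```

With the calibration key, chosen is the plain reflectance; with modifiers only, least_modified is the radiance, so the default lookup would need new rules to keep returning the reflectance.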