satpy icon indicating copy to clipboard operation
satpy copied to clipboard

scn.available_dataset_ids() should indicate calibrations in a more user-friendly manner

Open gerritholl opened this issue 2 years ago • 13 comments

Describe the bug

I used scn.available_dataset_ids() to inquire the available calibrations for a particular channel. Unfortunately, the result was not helpful, as it just indicated calibrations <1>, <3>, and <4>.

To Reproduce

from satpy import Scene
from glob import glob
filenames = glob("W_XX-EUMETSAT-Darmstadt,IMG+SAT,MTI1+FCI-1C-RRAD-*.nc")
sc = Scene(filenames=filenames, reader="fci_l1c_nc")
print([x for x in sc.available_dataset_ids() if x["name"] == "vis_06"])

Expected behavior

I would expect parameters that I can pass directly to sc.load(["vis_06}', calibration=...), perhaps:

[DataID(name='vis_06', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration="counts", modifiers=()),
 DataID(name='vis_06', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration="radiance", modifiers=()),
 DataID(name='vis_06', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration="reflectance", modifiers=())]

Actual results

In reality, I get numbers that I cannot meaningfully use or interpret as a human:

[DataID(name='vis_06', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration=<1>, modifiers=()),
 DataID(name='vis_06', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration=<3>, modifiers=()),
 DataID(name='vis_06', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration=<4>, modifiers=())]

Environment Info:

  • Satpy Version: 0.42.3.dev162+g89891b30e.d20230629

Additional context

I guess it's an enum behind the scenes and that the value rather than the description of the enum is represented.

gerritholl avatar Aug 22 '23 11:08 gerritholl

Unless something is customized for FCI, this is supposed to be as you expect. It might show the enum, but I thought that enum had the string values. I wonder if I broke this when I worked on Scene serialization stuff a while ago. Any idea if it is this way (integers) for other readers?

djhoese avatar Aug 22 '23 13:08 djhoese

It's the same for SEVIRI:

from satpy import Scene
from glob import glob
seviri_files = glob("/media/nas/x21308/scratch/SEVIRI/202103300900/H-000*")
sc = Scene(filenames={"seviri_l1b_hrit": seviri_files})
print(sc.available_dataset_ids())

also gives results with calibration=<1>, calibration=<3>, etc.

gerritholl avatar Aug 22 '23 13:08 gerritholl

Yeah I see it for ABI too. What version of Python are you on?

djhoese avatar Aug 22 '23 13:08 djhoese

Should be an easy fix:

https://github.com/pytroll/satpy/blob/6326b0358a8941cb509f3cd14f761d64807c9e3d/satpy/dataset/dataid.py#L89-L91

djhoese avatar Aug 22 '23 13:08 djhoese

I'm on Python 3.11.4.

gerritholl avatar Aug 22 '23 13:08 gerritholl

This is what it would show if we didn't have that __repr__:

In [14]: dataid.IntEnum.__repr__(data_id["calibration"])
Out[14]: '<calibration.reflectance: 1>'

djhoese avatar Aug 22 '23 13:08 djhoese

Ah it's a Python 3.10 to 3.11 change, here is Python 3.10:

In [1]: from satpy import Scene; from glob import glob
scn
In [2]: scn = Scene(reader="abi_l1b", filenames=glob("/data/satellite/abi/20180711/OR_ABI-L1b-RadF-M3C02_G16_s20181921800440_e20181921811207_c20181921811242.nc"))

In [3]: scn.available_dataset_ids()
Out[3]:
[DataID(name='C02', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration=<calibration.reflectance>, modifiers=()),
 DataID(name='C02', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration=<calibration.radiance>, modifiers=()),
 DataID(name='C02', wavelength=WavelengthRange(min=0.59, central=0.64, max=0.69, unit='µm'), resolution=500, calibration=<calibration.counts>, modifiers=())]

djhoese avatar Aug 22 '23 14:08 djhoese

And it is documented:

https://docs.python.org/3/library/enum.html#enum.IntEnum

djhoese avatar Aug 22 '23 14:08 djhoese

so in that code snippet above, we should replace str(self) with repr(self), right?

mraspaud avatar Aug 31 '23 12:08 mraspaud

Depends what we want exactly and how compatible it is between Python versions. Do we want it to be "reflectance"? Or "<reflectance>"? Or something else?

djhoese avatar Aug 31 '23 12:08 djhoese

So I think if we do self.name in the __repr__ I referenced above this would fix it. @gerritholl Have time to make a pull request?

djhoese avatar Aug 31 '23 16:08 djhoese

shouldn't we change __str__ rather than __repr__? I think str should be more user friendly, while repr is more for serializing

mraspaud avatar Sep 01 '23 06:09 mraspaud

I don't care (nor do I necessarily know) what has to be changed, but I think we want an ipython session doing scn.available_dataset_ids() to show a DataID object with "reflectance" rather than <calibration.reflectance: 1>, right? ipython and other REPLs will be doing repr unless you do print which I think does __str__?

djhoese avatar Sep 01 '23 13:09 djhoese