satpy icon indicating copy to clipboard operation
satpy copied to clipboard

Should a Scene be a Mapping?

Open gerritholl opened this issue 5 years ago • 4 comments

Feature Request

Is your feature request related to a problem? Please describe.

Scene objects currently behave a bit like mappings, but not quite. They support indexing, but looping yields values rather than keys. They have keys and values methods, but not items. They have __getitem__ and __iter__, but not __len__. Scene.keys() gives a list, but Scene.values() a dict_values object. Consistently implementing a Scene as a mapping may reduce surprising and inconsistent behaviour.

Describe the solution you'd like

I'd like a Scene to be a mapping, i.e. to implement the full Mapping (perhaps even MutableMapping) interface. Implementing the Mapping interface implies:

  • Scene.__iter__ should yield keys, not values,
  • Scene needs to implement __len__ and items,
  • .keys() and .items() should return views, not lists.

Describe any changes to existing user workflow

Stuff will break. Not everything. Any code looping over a Scene will break. Some code using .keys() will also break if it relies on behaviour supported by sequences but not by sets. Specifically:

  • Almost all calls for x in sc will need to be changed to for x in sc.values();
  • some calls to sc.keys() will need to be changed to list(sc.keys()).

Additional context

A MultiScene should probably not be a Mapping but rather an ordered Collection of Mappings (it currently an Iterable but not a Collection, as it doesn't support __len__ or 'contains`, but that is a separate issue).

gerritholl avatar Oct 30 '20 11:10 gerritholl

I think I'm fine with all of this. I mentioned on slack that I think the original purpose for making __iter__ return a sequence of values was that the DatasetIDs (the keys underneath the dictionary/mapping at the time) weren't really meant to be something everyday users needed access to.

djhoese avatar Oct 30 '20 18:10 djhoese

A more direct example of surprising behaviour:

  • if x in sc: — checks keys
  • for x in sc: — yields values

gerritholl avatar Nov 10 '20 08:11 gerritholl

also something I find disturbing is that "x in sc" checks for approximate matches too.

mraspaud avatar Nov 10 '20 08:11 mraspaud

Does it and shouldn't it?

https://github.com/pytroll/satpy/blob/979608ae6c5a9d443535953816c0d2ab002cfa16/satpy/dataset/data_dict.py#L234-L240

Doesn't the last line (which I think normally delegates to dict) exclude approximate matches? But shouldn't we want approximate matches to be included? Normally for a mapping I would expect

key in mapping

to be roughly equivalent to and perhaps implemented as

try:
   mapping[key]
except KeyError:
    return False
else:
    return True

but as I read the DatasetDict implementation it's possible for key in sc to be False despite sc[key] to be a success, although I suppose it depends on how the hashing or equivalence checking is implemented on the types that can serve as keys (but aren't those always DataID when using the normal scene interface anyway...?).

gerritholl avatar Nov 10 '20 11:11 gerritholl