Should a Scene be a Mapping?
Feature Request
Is your feature request related to a problem? Please describe.
Scene objects currently behave a bit like mappings, but not quite. They support indexing, but looping yields values rather than keys. They have `keys` and `values` methods, but not `items`. They have `__getitem__` and `__iter__`, but not `__len__`. `Scene.keys()` gives a list, but `Scene.values()` a `dict_values` object. Consistently implementing a Scene as a mapping may reduce surprising and inconsistent behaviour.
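To make the inconsistency concrete, a rough sketch of the behaviour described above (assuming `scn` is a Scene with datasets already loaded; `"overview"` is just a placeholder dataset name):

```python
# `scn` is a Scene with datasets loaded; "overview" stands in for a real name.
scn["overview"]       # indexing works (__getitem__)

for x in scn:         # iteration yields the loaded *values*, not their keys
    print(type(x))

scn.keys()            # a list
scn.values()          # a dict_values object
# len(scn)            # fails: __len__ is not implemented
# scn.items()         # fails: there is no items() method
```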
Describe the solution you'd like
I'd like a Scene to be a mapping, i.e. to implement the full `Mapping` (perhaps even `MutableMapping`) interface. Implementing the `Mapping` interface implies the following (sketched in code after this list):
- `Scene.__iter__` should yield keys, not values,
- `Scene` needs to implement `__len__` and `items`,
- `.keys()` and `.items()` should return views, not lists.
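A minimal sketch of what that could look like, using `collections.abc.Mapping` and a made-up internal `_datasets` dict (this is not Satpy's actual storage; it is just a stand-in for however Scene keeps its datasets):

```python
from collections.abc import Mapping


class MappingScene(Mapping):
    """Sketch of a Scene exposing the full Mapping interface."""

    def __init__(self, datasets=None):
        # Stand-in for the Scene's real internal dataset storage.
        self._datasets = dict(datasets or {})

    def __getitem__(self, key):
        return self._datasets[key]

    def __iter__(self):
        # Iterating a Mapping yields keys, matching dict semantics.
        return iter(self._datasets)

    def __len__(self):
        return len(self._datasets)

    # Mapping then provides keys(), values(), items(), __contains__ and get()
    # for free, all returning views / delegating to __getitem__ consistently.
```

With that in place, `for x in scn` yields keys, `len(scn)` works, and `scn.items()` returns a view, all consistent with `dict`.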
Describe any changes to existing user workflow
Some things will break, though not everything. Any code looping directly over a Scene will break. Some code using `.keys()` will also break if it relies on behaviour supported by sequences (such as indexing) but not by the set-like views a Mapping returns. Specifically (see the before/after sketch following this list):
- almost all calls of the form `for x in sc` will need to be changed to `for x in sc.values()`;
- some calls to `sc.keys()` will need to be changed to `list(sc.keys())`.
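A before/after sketch of those migration changes (assuming `sc` is a Scene; `process` is a placeholder for whatever the calling code does with a dataset):

```python
# Before: iterating a Scene yields values, and keys() returns a list.
for dataset in sc:
    process(dataset)           # `process` is a placeholder
first_key = sc.keys()[0]       # relies on list indexing

# After: iteration yields keys, so values() must be explicit,
# and keys() returns a view that does not support indexing.
for dataset in sc.values():
    process(dataset)
first_key = list(sc.keys())[0]
```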
Additional context
A MultiScene should probably not be a Mapping but rather an ordered Collection of Mappings (it is currently an Iterable but not a Collection, as it doesn't support `__len__` or `__contains__`, but that is a separate issue).
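For reference, a minimal sketch of what "an ordered Collection of Mappings" could mean, using a made-up wrapper class (this is not the MultiScene implementation):

```python
from collections.abc import Collection


class SceneCollection(Collection):
    """Made-up sketch of an ordered Collection of Scene-like mappings."""

    def __init__(self, scenes):
        self._scenes = list(scenes)   # a list preserves ordering

    def __iter__(self):
        return iter(self._scenes)

    def __len__(self):
        return len(self._scenes)

    def __contains__(self, scene):
        return scene in self._scenes
```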
I think I'm fine with all of this. I mentioned on Slack that I think the original purpose for making `__iter__` return a sequence of values was that the DatasetIDs (the keys underneath the dictionary/mapping at the time) weren't really meant to be something everyday users needed access to.
A more direct example of surprising behaviour (made concrete in the snippet after this list):
- `if x in sc` checks the keys,
- `for x in sc` yields the values.
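A hypothetical snippet making that asymmetry concrete, assuming `sc` is a Scene, `x` is a key present in it, and keys never compare equal to values:

```python
# `x` is a key present in the Scene `sc`.
assert x in sc            # membership is checked against the keys...
assert x not in list(sc)  # ...but iteration yields the values, so the
                          # same key never shows up when looping
```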
Also, something I find disturbing is that `x in sc` checks for approximate matches too.
Does it, and shouldn't it?
https://github.com/pytroll/satpy/blob/979608ae6c5a9d443535953816c0d2ab002cfa16/satpy/dataset/data_dict.py#L234-L240
Doesn't the last line (which I think normally delegates to `dict`) exclude approximate matches? But shouldn't we want approximate matches to be included? Normally, for a mapping, I would expect `key in mapping` to be roughly equivalent to, and perhaps implemented as, something like

```python
def contains(mapping, key):
    # Expected semantics of ``key in mapping``.
    try:
        mapping[key]
    except KeyError:
        return False
    else:
        return True
```
but as I read the DatasetDict implementation, it's possible for `key in sc` to be `False` even though `sc[key]` succeeds, although I suppose it depends on how the hashing or equivalence checking is implemented on the types that can serve as keys (but aren't those always `DataID` when using the normal Scene interface anyway...?).
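A made-up, self-contained illustration of how such an asymmetry can arise (this is not DatasetDict, just a toy `dict` subclass with a fuzzy `__getitem__` and the inherited exact `__contains__`):

```python
class FuzzyDict(dict):
    """Toy dict whose __getitem__ also accepts approximate (substring) keys.

    __contains__ is inherited from dict, so it only does exact matching.
    """

    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            # Fall back to an approximate match on the string form of the keys.
            for existing in self:
                if str(key) in str(existing):
                    return super().__getitem__(existing)
            raise


fd = FuzzyDict({"overview_with_night_ir": 1})
print(fd["overview"])      # 1     -- approximate lookup succeeds
print("overview" in fd)    # False -- membership check is exact
```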