strictyaml
strictyaml copied to clipboard
Difficult to use Optional("key", None)
Great project, just a few usability issues.
Assume the following schema:
import strictyaml as s
s.load("a: 8", s.Map({
s.Optional("a", 1): s.Int(),
s.Optional("key_a", None): s.Str(),
s.Optional("key_b", {}): s.EmptyDict() | s.Map({})
}))
A few issues with usability arise:
- Why is
a: 8
required from the YAML file? It seems like a blank file (or one consisting only of comments) should be perfectly valid with this schema. - The result is
YAML(OrderedDict([('a', 8), ('key_b', {})]))
. It is confusing thatkey_a
is missing entirely from the results, despite having a default value ofNone
. A common pattern for fixing this in Python is to defineOptional
using e.g.:
missing = {} # Creates a unique object
class Optional:
def __init__(self, key, default=missing):
...
# Later
if optional.default is missing:
# User specified no default, OK not to include in output
In this way, Optional
with no default specified can result in the absence of the key, while Optional
with any default would always be present in the output.
Ah, one more I forgot about: there seems to be no way to specify an empty dictionary within the yaml file: hi: {}
of course violates the no-flow ideology of StrictYAML. I agree with no flows in general, but some way to specify a blank dictionary is important in some cases (e.g. Seq(Map({Optional('k'): Str()}))
, where all values in the Map
are optional).
Why is a: 8 required from the YAML file? It seems like a blank file (or one consisting only of comments) should be perfectly valid with this schema.
You have stated that there is a mapping, therefore there needs to be a mapping. A mapping must contain at least one key value pair, otherwise it's not a mapping. You can have all keys being optional if you like but at least one must be used.
If you want a blank file to be valid you can just do EmptyNone() | Map({...}) (or EmptyDict, to have it parse to an empty dict).
The result is YAML(OrderedDict([('a', 8), ('key_b', {})])). It is confusing that key_a is missing entirely from the results, despite having a default value of None
Granted. I'll have a think over this edge case.
Ah, one more I forgot about: there seems to be no way to specify an empty dictionary within the yaml file: hi: {} of course violates the no-flow ideology of StrictYAML. I agree with no flows in general, but some way to specify a blank dictionary is important in some cases (e.g. Seq(Map({Optional('k'): Str()})), where all values in the Map are optional).
I believe you're looking for EmptyDict, which will treat a scalar value with nothing there as an empty dict ("x:").
You have stated that there is a mapping, therefore there needs to be a mapping. A mapping must contain at least one key value pair, otherwise it's not a mapping
Why is that, anyway? I can accept it, but no other language / data structure makes the distinction between an empty map and a mapping with values. A mapping it just a type of data structure, no reason it shouldn't be empty. The functionality added by Optional
means that strictyaml
covers the use cases of empty and mandatory non-empty mappings, without needing a separate EmptyDict
class (which is also named strangely; why no EmptyMap
?).
If you want a blank file to be valid you can just do EmptyNone() | Map({...}) (or EmptyDict, to have it parse to an empty dict).
Good to know.
I believe you're looking for EmptyDict, which will treat a scalar value with nothing there as an empty dict ("x:").
I still have a minor grievance with this, in that a schema is then required to produce an empty dict, but I'll admit it's a corner case.
Why is that, anyway? I can accept it, but no other language / data structure makes the distinction between an empty map and a mapping with values. A mapping it just a type of data structure, no reason it shouldn't be empty.
Two reasons:
-
Paring down YAML to the bare minimum and losing flow style meant losing the ability to represent non-empty mappings without schemas. I'm pretty ok with this since blank string suffices as a replacement. I have been using it as such.
-
I think the distinction between an empty mapping and a nominally blank value is somewhat confusing for a non-developer, and StrictYAML, unlike YAML, is supposed to be super simple for non developers to use.
I still have a minor grievance with this, in that a schema is then required to produce an empty dict, but I'll admit it's a corner case.
You can also not use a schema and just manually convert empty string to empty dict.
Though StrictYAML is not really mean to be used without schema for anything halfway complicated.
Hmm, so you're saying that all of the Empty*
schemas were built with empty strings in mind, rather than e.g. the dict which is returned being empty. That makes a bit more sense.
I think the distinction between an empty mapping and a nominally blank value is somewhat confusing for a non-developer, and StrictYAML, unlike YAML, is supposed to be super simple for non developers to use.
Absolutely, but there's no need to do so at the cost of convenience when developing with StrictYAML. My main issue with usability is that Map({...})
refuses to validate {}
, e.g. when the Map
is the value attached to an Optional('key_to_map', {})
, and the Map
has all optional keys:
import strictyaml as s
s.load('a: 8', s.Map({'a': s.Int(), s.Optional('b', {}): s.Map({s.Optional('key', 8): s.Int()})}))
# InvalidOptionalDefault: Optional default for 'b' failed validation
If you don't want to support a blank string in this case, fine, but I don't think it should be necessary to add EmptyDict
to the schema in this case.
Hmm, so you're saying that all of the Empty* schemas were built with empty strings in mind, rather than e.g. the dict which is returned being empty. That makes a bit more sense.
Precisely. Since there's no clear one to one relationship between empty dict and YAML you have to use the schema to nail down what it should actually be - is empty dict the same as not having a key? Or is it having a key with an empty string? Or, is it just meaningful at all?
This discussion clearly indicates that the docs could be a bit clearer on this point, however.
Absolutely, but there's no need to do so at the cost of convenience when developing with StrictYAML. My main issue with usability is that Map({...}) refuses to validate {}, e.g. when the Map is the value attached to an Optional('key_to_map', {}), and the Map has all optional keys:
So, a schema is used not only to parse StrictYAML to data but also to serialize data to StrictYAML. There's is, by necessity, a symmetric reflexive relation here. Any optional parameter has to be something that could unambiguously be serialized into StrictYAML and parsed back again with no loss of meaning or data. Should an empty dict be serialized as a key with an empty string? No key at all? Or is it simply invalid? Should an empty string be parsed as a dict? Should it be an empty string? None?
This is what having a schema nails down - and nails it down in such a way that subtle bugs don't creep in (e.g. erroneously getting back None where you actually expected an empty dict). It might seem like onerous work up front but the strictness is supposed to protect you from subtle bugs later on and give you absolute confidence in what you're parsing or serializing, as well as the ability to trivially and safely roundtrip data.
If you don't want to support a blank string in this case, fine, but I don't think it should be necessary to add EmptyDict to the schema in this case.
Well, with respect, I don't think it's all that onerous to add emptydicts or emptynones.
I understand that Map with all Optionals might imply that an empty dict is possible (and that could be confusing), but I need some way to resolve the natural ambiguities while maintaining symmetric parse/serialization reflexivity.
Am open to other suggestions as to how to square this circle, but it needs to work well and simply in all cases and not just this one.
Hmm, so you're saying that all of the Empty* schemas were built with empty strings in mind, rather than e.g. the dict which is returned being empty. That makes a bit more sense.
Precisely. Since there's no clear one to one relationship between empty dict and YAML you have to use the schema to nail down what it should actually be - is empty dict the same as not having a key? Or is it having a key with an empty string? Or, is it just meaningful at all?
This discussion clearly indicates that the docs could be a bit clearer on this point, however.
Absolutely, but there's no need to do so at the cost of convenience when developing with StrictYAML. My main issue with usability is that Map({...}) refuses to validate {}, e.g. when the Map is the value attached to an Optional('key_to_map', {}), and the Map has all optional keys:
So, a schema is used not only to parse StrictYAML to data but also to serialize data to StrictYAML. There's is, by necessity, a symmetric reflexive relation here. Any optional parameter has to be something that could unambiguously be serialized into StrictYAML and parsed back again with no loss of meaning or data. Should an empty dict be serialized as a key with an empty string? No key at all? Or is it simply invalid? Should an empty string be parsed as a dict? Should it be an empty string? None?
This is what having a schema nails down - and nails it down in such a way that subtle bugs don't creep in (e.g. erroneously getting back None where you actually expected an empty dict). It might seem like onerous work up front but the strictness is supposed to protect you from subtle bugs later on and give you absolute confidence in what you're parsing or serializing, as well as the ability to trivially and safely roundtrip data.
If you don't want to support a blank string in this case, fine, but I don't think it should be necessary to add EmptyDict to the schema in this case.
Well, I don't think it's all that onerous to add emptydicts or emptynones. I have to do this too, of course, but I've never had to do it so many times that it became an issue for me.
I understand that Map with all Optionals might imply that an empty dict is possible (and that could be confusing), but I need some way to resolve the natural ambiguities while maintaining symmetric parse/serialization reflexivity.
Am open to other suggestions as to how to square this circle, but it needs to work well, unambiguously and simply in all cases and not just this one.
So, a schema is used not only to parse StrictYAML to data but also to serialize data to StrictYAML. There's is, by necessity, a symmetric reflexive relation here. Any optional parameter has to be something that could unambiguously be serialized into StrictYAML and parsed back again with no loss of meaning or data
Even with your longer description, I completely fail to see why Seq
cannot serialize an empty list. After all, I would presume that strictyaml
assumes that the same schema is used for loading what it saves. So, if Seq
should treat empty strings the same way EmptyList
does, that behavior is consistent for both saving and loading. Same with Map
and EmptyDict
. The only remaining issue would be that strictyaml
would need to remember if the key existed in the file as loaded, but as far as I can tell, you already have that behavior implemented for e.g. comments. Could add a similar bit of metadata for these fields.
Well, with respect, I don't think it's all that onerous to add emptydicts or emptynones.
The problem is that you're forcing users of strictyaml
to be aware of this. Yes, it's not too bad, but on the other hand it's a real hurdle for newcomers (have to trawl through the docs and learn about EmptyDict
/ EmptyList
).
I’ve admittedly only skimmed this thread but I would like to echo point 2 made by @wwoods in the initial comment: Having non-provided default values be missing in the YAML mapping is … inconvenient.
Effectively this means that my code is littered with the equivalent of
name = data['name'] if 'name' in data else None
It’s not tragic but it is unnecessary clutter. In particular it means that I can’t use values from the mapping directly in expressions, I first have to assign them to a local variable. For instance, I’d like to be able to write (actual code)
if not test['disabled'] or test['name'] in self.include_tests:
self.enable(test)
However, this crashes. Instead I’d have to write
if not test['disabled'] or ('name' in test and test['name'] in self.include_tests):
self.enable(test)
@klmr just a tiny suggestion that
name = data['name'] if 'name' in data else None
can easily be replaced with
name = data.get('name')
which makes the syntactic burden of this feature of strictyaml
easier to deal with.
The referenced issue above fixed this problem for me. By specifying the schema like this:
schema = Map({
Optional("endpoint_url", drop_if_none=False): EmptyNone() | Str()
})
It sets the value to None
instead of dropping the key completely.