strictyaml icon indicating copy to clipboard operation
strictyaml copied to clipboard

Bug: StrictYaml cannot serialize `None` (null value)

Open cowlinator opened this issue 5 years ago • 20 comments

Typing the following into the python interpreter

Python 3.7.1 (v3.7.1:260ec2c36a, Oct 20 2018, 14:05:16) [MSC v.1915 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import strictyaml
>>> yaml = strictyaml.as_document({'a':None})

results in

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Temp\3\venv3\lib\site-packages\strictyaml\parser.py", line 258, in as_document
    return schema(YAMLChunk(schema.to_yaml(data), label=label))
  File "D:\Temp\3\venv3\lib\site-packages\strictyaml\any_validator.py", line 45, in to_yaml
    return schema_from_data(data).to_yaml(data)
  File "D:\Temp\3\venv3\lib\site-packages\strictyaml\compound.py", line 197, in to_yaml
    for key, value in data.items()
  File "D:\Temp\3\venv3\lib\site-packages\strictyaml\compound.py", line 200, in <listcomp>
    and value != self._defaults[key]
  File "D:\Temp\3\venv3\lib\site-packages\strictyaml\scalar.py", line 148, in to_yaml
    raise YAMLSerializationError("'{}' is not a string".format(data))
strictyaml.exceptions.YAMLSerializationError: 'None' is not a string

This appears to be a bug. (If it is not a bug, this gaping feature-hole is not documented anywhere.)

cowlinator avatar May 23 '19 01:05 cowlinator

It doesn't say it explicitly, but the answer is shown in various examples. The solution is usually:

>>> import strictyaml
>>> yaml = strictyaml.as_document({'a': 'null'})

Alternatively:

>>> import strictyaml
>>> yaml = strictyaml.as_document({'a': '', 'a_is_null': 'true'})

or

>>> import strictyaml
>>> yaml = strictyaml.as_document({})   # a is missing, and therefore implied null

Or you could set the string to "none" or "None". Or, whatever you feel is most appropriate.

The key is that assigning type is a function of the application, not the strictyaml document.

To word that in a more pythonic manner: strictyaml only does strings. That is why schemas are often used by the libraries to aid in translation of those strings to anything else.

JohnAD avatar May 23 '19 02:05 JohnAD

I'm not sure how I could ever guarantee that the string value never happens to actually be "null", so I guess the 3rd option makes the most sense. Unfortunately, this requires a schema.

Thank you for explaining this. I would like to see this added to the documentation explicitly, since this seems like a common data value, and I can easily imagine others running into this issue.

cowlinator avatar May 23 '19 18:05 cowlinator

Thanks @JohnAD you summed it up perfectly: strictyaml only does strings, dicts and lists by default. anything else requires disambiguation via schema.

I'll add this to the documentation explicitly - I agree it's not as clear as it could be.

crdoconnor avatar Jun 02 '19 17:06 crdoconnor

So how do I specify a (optional) null in the schema so that I get a None? I thought about using Optional('a', None): Str() but it still returns "null". EmptyNone() expects the value to be missing, which is also not what I want.

ArneBachmannDLR avatar Apr 22 '20 13:04 ArneBachmannDLR

Here is my workaround:

from strictyaml.scalar import ScalarValidator

class None_(ScalarValidator):
  ''' Validator for strictyaml schemas expecting a None. '''

  def validate_scalar(_, chunk):
    if chunk.contents == 'null': return None
    chunk.expecting_but_found("when expecting None")

  def to_yaml(_, data): return 'null'

Make sure to put None_() before | Str() or other unions.

ArneBachmannDLR avatar Apr 22 '20 13:04 ArneBachmannDLR

I can add a NoneValidator which will accept values other than empty (defaulting to null) if that helps?

On Wed, 22 Apr 2020, 14:25 ArneBachmannDLR, [email protected] wrote:

Here is my workaround:

from strictyaml.scalar import ScalarValidator class None_(ScalarValidator): ''' Validator for strictyaml schemas expecting a None. '''

def validate_scalar(_, chunk): if chunk.contents == 'null': return None chunk.expecting_but_found("when expecting None")

def to_yaml(_, data): return 'null'

Make sure to put None_() before | Str() or other unions.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/crdoconnor/strictyaml/issues/57#issuecomment-617775873, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABOJKNOU3QRCL5KE7VIEAGLRN3V5JANCNFSM4HOZFTOQ .

crdoconnor avatar Apr 22 '20 14:04 crdoconnor

I think that makes sense as a part of the project as a whole actually.

On Wed, 22 Apr 2020, 15:34 Colm O'Connor, [email protected] wrote:

I can add a NoneValidator which will accept values other than empty (defaulting to null) if that helps?

On Wed, 22 Apr 2020, 14:25 ArneBachmannDLR, [email protected] wrote:

Here is my workaround:

from strictyaml.scalar import ScalarValidator class None_(ScalarValidator): ''' Validator for strictyaml schemas expecting a None. '''

def validate_scalar(_, chunk): if chunk.contents == 'null': return None chunk.expecting_but_found("when expecting None")

def to_yaml(_, data): return 'null'

Make sure to put None_() before | Str() or other unions.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/crdoconnor/strictyaml/issues/57#issuecomment-617775873, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABOJKNOU3QRCL5KE7VIEAGLRN3V5JANCNFSM4HOZFTOQ .

crdoconnor avatar Apr 22 '20 14:04 crdoconnor

I think it would be useful to define a value to be optionally None (YAML's null). The problem is that None is already taken in Python, so that None() is not available, therefore I went with None_(). An alternative might be NoneOr(Type()) or Nullable(Type).

ArneBachmannDLR avatar Apr 22 '20 14:04 ArneBachmannDLR

Agreed.

On Wed, 22 Apr 2020, 15:59 ArneBachmannDLR, [email protected] wrote:

I think it would be useful to define a value to be optionally None (YAML's null). The problem is that None is already taken in Python, so that None() is not available, therefore I went with None_(). An alternative might be NoneOr(Type()) or Nullable(Type).

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/crdoconnor/strictyaml/issues/57#issuecomment-617832193, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABOJKNIE3BKYB7JGSJG77HDRN4A45ANCNFSM4HOZFTOQ .

crdoconnor avatar Apr 22 '20 17:04 crdoconnor

Hi! Any updates on this? Do we have a built-in validator for "null" -> None or should we extend ScalarValidator as Arne has shown? (thanks btw). Maybe Null is a meaningful name as well, since it is not protected?

andres-fr avatar Feb 14 '21 02:02 andres-fr

This is what has been working for me so far, hope it helps!


# see https://yaml.org/spec/1.2/spec.html#id2805071
YAML_NULL = ["null", "Null", "NULL", "~"]

class Null(ScalarValidator):
    """
    'null'->None
    """
    @staticmethod
    def validate_scalar(chunk):
        """
        """
        if any([chunk.contents == n for n in YAML_NULL]):
            return None
        chunk.expecting_but_found(f"when expecting any of {YAML_NULL}")


class NotNullStr(Str):
    """
    """
    @staticmethod
    def validate_scalar(chunk):
        """
        """
        if any([chunk.contents == n for n in YAML_NULL]):
            chunk.expecting_but_found(
                f"when expecting a string different to these: {YAML_NULL}")
        return chunk.contents

Then the schema can look like this:

Map({"code": NotNullStr(),
     "name": Null() | Str(),
     ...

andres-fr avatar Feb 16 '21 17:02 andres-fr

I've added NullNone as a new validator that parses "null" to None and serializes None to "null" and release tomorrow. Apologies I dropped the ball on this one.

crdoconnor avatar Feb 16 '21 22:02 crdoconnor

That's great thanks! can you link the commit here for reference upon release? Also regarding docs, I'd emphasize the idea that NullNone has to come before Str to prevent getting "null" instead of None

andres-fr avatar Feb 16 '21 22:02 andres-fr

Great progress! Looking forward to the release.

BTW naming: NullNone is a very explicit term but doesn't mirror NotNullStr very well. Can't we stick to Null? It's very clear and fits better the NotNullTypes in my opinion.

ArneBachmann avatar Feb 18 '21 09:02 ArneBachmann

I wanted it to to mirror the existing EmptyNone / EmptyDict / EmptyList validates.

I don't plan on having a NotNull type. Should I?

On Thu, 18 Feb 2021, 09:46 Arne Bachmann, [email protected] wrote:

Great progress! Looking forward to the release.

BTW naming: NullNone is a very explicit term but doesn't mirror NotNullStr very well. Can't we stick to Null? It's very clear and fits better the NotNullTypes in my opinion.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/crdoconnor/strictyaml/issues/57#issuecomment-781219014, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABOJKNI2RMRKDGQVWDNIUT3S7TOV3ANCNFSM4HOZFTOQ .

crdoconnor avatar Feb 18 '21 12:02 crdoconnor

I don't plan on having a NotNull type. Should I?

I think only NotNullStr is needed due to default Str behavior. IMO in most cases it is crucial to know when a str field is expected to not be null

andres-fr avatar Feb 18 '21 17:02 andres-fr

What's the use case you're aiming for with this? Parsing YAML 1.2?

On Thu, 18 Feb 2021, 17:45 Andres Fernandez, [email protected] wrote:

I don't plan on having a NotNull type. Should I?

I think only NotNullStr is needed due to default Str behavior. IMO in most cases it is crucial to know when a str field is expected to not be null

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/crdoconnor/strictyaml/issues/57#issuecomment-781521020, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABOJKNNUWAP7CSXM7DPFG6DS7VG4PANCNFSM4HOZFTOQ .

crdoconnor avatar Feb 18 '21 17:02 crdoconnor

I got a yaml database of unknown version and want to infer the schema. The sensible thing to do for Str is to assume notnull if I can't find such examples. I don't want to risk parsing straw nulls as Str.

Many reasons for this: avoid redundancies, ensure operability... it is IMO basic to data curation and the main reason why you would use a schema in the first place

andres-fr avatar Feb 18 '21 18:02 andres-fr

Also for the record I'm using this wrapper on CommaSeparated, Enum. For the same reasons as above I found it really useful to combine them and have a strong grip on the accepted options for a comma-separated list.

Not saying this should be added to the API, though. Also, if you know another open issue where this could apply I'll be happy to move the comment there.

Thanks again!

class TagList(CommaSeparated):
    """
    An alias for ``CommaSeparated(Enum([tag1, tag2, ...]))``
    """

    def __init__(self, *tags):
        """
        """
        assert all(isinstance(t, str) for t in tags), \
            f"{self.__class__.__name__} admits string tags only!"
        super().__init__(Enum(tags))

Usage: In the following example we expect the drinks field to contain either nothing, or a list of comma-separated elements, where each element can be only water, beer or juice.

Map({"code": NotNullStr(),
         "name": Null() | Str(),
         "drinks": TagList("water", "beer", "juice"),
     ...

andres-fr avatar Feb 18 '21 20:02 andres-fr

I wanted it to to mirror the existing EmptyNone / EmptyDict / EmptyList validates. I don't plan on having a NotNull type. Should I?

I don't think NotNull makes sense. Regarding the EmptyDict I always found it hard to grasp (if empty then dict? accept empty or dict? if not defined then interpret as dict?). But following that scheme NullNone makes sense. Either naming decision is valid.

ArneBachmann avatar Feb 19 '21 08:02 ArneBachmann