WireViz icon indicating copy to clipboard operation
WireViz copied to clipboard

[internal] Switch from PyYAML to Ruamel?

Open 17o2 opened this issue 4 years ago • 9 comments

This would add support for YAML 1.2 and solve #213. Meanwhile, YAML 1.2 support seems to still be an open issue for PyYAML.

Any opinions on this matter?

17o2 avatar Mar 20 '21 10:03 17o2

Referring to the discussion in #305:

It seems to be how the current YAML parser is intended to work. It might change in the future, e.g. if deciding to use a parser supporting YAML version 1.2 - where the unquoted words Yes/No and On/Off are no longer interpreted as aliases for the boolean values true/false. See also the related issues #213 and #223.

To override how to interpret the type of these unquoted words, will have these effects:

  • It's a global change and thus applies to all YAML input.
  • The WireViz input language no longer conforms to the YAML 1.1 specification.

[...]

@EvilMustacheTwirl is making a good point. Interpreting everything as a string could help fix some issues. Numbers and Booleans should be evaluated only where the WireViz syntax requires this. Would a simple regex replacement hack possibly also do the job?

I realize that the above will not allow linking the YAML input to line numbers. The YAML parser replacement seems essential for linking the error output to the input file. Based on https://github.com/wireviz/WireViz/issues/207#issuecomment-2116255383 request for machine-readable output, the replacement would be beneficial for https://github.com/nanangp/vscode-wireviz-preview/issues/7

martinrieder avatar May 17 '24 08:05 martinrieder

I did a bit of research and found that https://github.com/nexB/saneyaml might be lower hanging fruit than Ruamel.

EDIT: two further promising options are https://github.com/crdoconnor/strictyaml https://github.com/jcrist/msgspec

martinrieder avatar May 17 '24 19:05 martinrieder

For anyone who wants to dive a bit deeper into the topic, I recommend reading the following document:

https://eemeli.org/yaml/#parsing-yaml

This comes from the library that is used to provide syntax highlighting and error checking in https://github.com/redhat-developer/YAML-language-server for VScode. It is not programmed in Python, but we might take advantage of the CLI that it provides: It relies on parsing YAML into a CST as an intermediate step to provide an AST. This is a unique approach that other libraries do not take, but seems to be just what we need here.

Side note: Graphviz dot is also supported in VScode through https://github.com/nikeee/dot-language-server

martinrieder avatar May 21 '24 08:05 martinrieder

To enable comparing the alternatives, we should list the goals we want to achieve by switching library. Unless all goals can be achieved, then I guess we also must rank the goals by importance. A few goal candidates:

  • Support all YAML features used by WireViz
  • Resolve issues #213, #220, #297, #305, #307
  • Avoid too much code redesign

kvid avatar May 21 '24 11:05 kvid

Please extend the list of criteria with:

  • dependency management might be easier whether it shall rely on another Python lib or some external tools
  • matching schema validation (per https://github.com/wireviz/WireViz/issues/348#issuecomment-2138310959)
  • https://github.com/wireviz/WireViz/issues/19

martinrieder avatar May 30 '24 07:05 martinrieder

The parser might need to provide default values other than none if some attributes are missing. See the proposed use case here: https://github.com/wireviz/WireViz/issues/257#issuecomment-2151346345

There is need to have a dummy table: It would be less confusing if the whitespace was internally set as a default. If users wanted to override this, they would have to define none explicitly.

Doing it vice-versa might be the better option. It just needs to assure that an empty string is not equal to none.

Another use case is described in https://github.com/wireviz/WireViz/issues/273#issuecomment-2152948615 about leaving some attributes undefined if they can be copied from connected wires.

martinrieder avatar Jun 06 '24 03:06 martinrieder

I would like to add another library with some interesting features to the list that should be evaluated: fabiocaccamo/python-benedict

martinrieder avatar Jun 14 '24 19:06 martinrieder

Note that PyYAML is making some progress towards YAML spec 1.2 support. A separate yamlcore package has been released on PyPi to enable compatibility, as announced here: https://github.com/yaml/pyyaml/pull/555#issuecomment-2067745960

Currently it supports enabling all YAML 1.2 Core Schema tags on top of the PyYAML BaseLoader. It does not (yet) support other tags like the << merge key. You can add custom constructors, though. For more information see the comparison of 1.1 and 1.2 schemas.

martinrieder avatar Jun 22 '24 00:06 martinrieder

+1 for this. It seems like it should be an easy switch to make (swapping out the library). It's been years and PyYAML still hasn't made the switch.

DeflateAwning avatar Sep 21 '25 02:09 DeflateAwning