
Cross platform grammar for strictyaml 2.0 / multi-language strictyaml

Open crdoconnor opened this issue 4 years ago • 7 comments

The grammar needs to support the following features:

  • Mappings with hierarchies:
a:
  c: d
  e: f
  • Lists - e.g.
a:
- 1
- 2
- 3
  • Multiline strings with |:
x: |
  This is a 
  multiline string
  • Comments (they must be parsed, not just ignored)

  • Every key and value is parsed as a string. A rule of the specification is that strictyaml will only read ordered mappings (i.e. ordered dicts), lists, strings and comments; type casting (e.g. to int/date/float/whatever) is always the responsibility of the language itself.

# This is a comment

a: x # another comment
b: c

# This is yet another comment

c: d
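
As a rough illustration of "comments must be parsed" and "everything is a string", here is a minimal line scanner in Python. It is only a sketch under stated assumptions: `Line` and `scan_line` are hypothetical names (not part of strictyaml), it handles `key: value` lines and comments only (no lists or block scalars), and it naively treats any ` #` as the start of a comment.

```python
from typing import NamedTuple, Optional


class Line(NamedTuple):
    indent: int             # number of leading spaces
    key: Optional[str]      # mapping key, or None for comment-only/blank lines
    value: Optional[str]    # always a string; never cast to int/float/date
    comment: Optional[str]  # comment text without the leading '#'


def scan_line(raw: str) -> Line:
    """Split one line into indent, key, value and comment (hypothetical helper)."""
    stripped = raw.lstrip(" ")
    indent = len(raw) - len(stripped)
    # A line that starts with '#' is a comment-only line.
    if stripped.startswith("#"):
        return Line(indent, None, None, stripped[1:].strip())
    # Simplification: any " #" after the value starts a trailing comment.
    comment = None
    if " #" in stripped:
        stripped, comment = stripped.split(" #", 1)
        comment = comment.strip()
    stripped = stripped.rstrip()
    if not stripped:
        return Line(indent, None, None, comment)
    key, _, value = stripped.partition(":")
    return Line(indent, key.strip(), value.strip(), comment)
```

With this sketch, `scan_line("a: x # another comment")` keeps the comment alongside the key and value instead of discarding it, and a line such as `count: 3` yields the string `"3"`, leaving any conversion to the caller.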

It should generate a lexer/parser (with a minimum of fuss) for Python, but ideally it could be used with many other languages as well.

Other features may be needed, but anything that can do all of these would be 90% of the way there, I think.

Currently https://www.antlr.org/ seems like the best candidate for achieving all of this, since it can target Java, C#, Python, JavaScript, Go, Swift, C++ and PHP. However, some lexer customization may be needed.

Related issues:

  • #93
  • #53

crdoconnor avatar Apr 18 '20 15:04 crdoconnor

Research so far:

  • https://www.libelektra.org/plugins/yanlr - has a hand-written lexer in C++ using ANTLR, does not parse block scalars, and the listener ignores comments. The hand-written C++ lexer makes it not very reusable.

  • https://github.com/tkellogg/enyaml - C# parser for YAML using ANTLR, but very rudimentary. It does seem to parse block scalars. Most promising, but it's written for ANTLR 3 and uses features that have been deprecated in ANTLR 4.

  • https://stackoverflow.com/questions/7476116/yaml-parsing-lex-or-hand-rolled - lex cannot be used to parse YAML :(

crdoconnor avatar Apr 18 '20 15:04 crdoconnor

  • Mappings with hierarchies:
a: b
 c: d
 e: f

So here a is both a string (storing b) and an object (storing c and e)?

If I do print(a) would I see b or would I see {'c': 'd', 'e': 'f'}?

shoogle avatar Apr 19 '20 03:04 shoogle

Whoops, that was an accident. No, that's invalid. The corrected version should parse as

{"a": {"c": "d", "e": "f"}}
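
To make the distinction concrete, here is a small Python sketch of an indentation-based mapping parser. It assumes mappings only (no lists, comments or block scalars), and `parse_mapping` is a hypothetical name, not strictyaml's API. It accepts the corrected form and rejects a key that has both a scalar value and indented children.

```python
def parse_mapping(text: str) -> dict:
    """Parse indentation-nested `key: value` lines into nested dicts of strings."""
    root: dict = {}
    # Stack of (indent, mapping) pairs; the top is the mapping being filled.
    stack = [(0, root)]
    # Empty mapping created for the previous key if it had no value, else None.
    pending = None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        key, sep, value = raw.strip().partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {raw!r}")
        value = value.strip()
        if indent > stack[-1][0]:
            # Deeper indentation is only legal directly under a key with no value.
            if pending is None:
                raise ValueError(f"unexpected indentation: {raw!r}")
            stack.append((indent, pending))
        else:
            # Dedent: return to the enclosing mapping at this indentation.
            while stack[-1][0] > indent:
                stack.pop()
            if stack[-1][0] != indent:
                raise ValueError(f"inconsistent indentation: {raw!r}")
        current = stack[-1][1]
        if value:
            current[key.strip()] = value  # scalar values stay strings
            pending = None
        else:
            pending = {}
            current[key.strip()] = pending
    return root
```

Here `parse_mapping("a:\n  c: d\n  e: f\n")` returns `{"a": {"c": "d", "e": "f"}}`, while the accidental `a: b` followed by indented keys raises `ValueError`. Python dicts preserve insertion order, which matches the ordered-mapping requirement. A key with no value and no children is left as an empty dict in this sketch, which a real implementation would probably want to treat differently.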

crdoconnor avatar Apr 19 '20 11:04 crdoconnor

Am I right in thinking the main problem is indentation of block scalars? Is this the only issue? If so, it might be worth requesting this feature from ANTLR.
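
For reference, the indentation handling for a `|` block can be sketched by hand in a few lines of Python. This is an illustrative sketch, not ANTLR and not strictyaml's implementation; `read_block_scalar` is a hypothetical helper, and it assumes space indentation and YAML's default behaviour of keeping a single trailing newline.

```python
from typing import List, Tuple


def read_block_scalar(lines: List[str], start: int) -> Tuple[str, int]:
    """Read the body of the `key: |` block whose header is lines[start].

    Returns the block text and the index of the line where scanning stopped.
    """
    header = lines[start]
    key_indent = len(header) - len(header.lstrip(" "))
    body = []
    i = start + 1
    while i < len(lines):
        line = lines[i]
        indent = len(line) - len(line.lstrip(" "))
        # A non-blank line indented no deeper than the key ends the block.
        if line.strip() and indent <= key_indent:
            break
        body.append(line)
        i += 1
    # Trailing blank lines are not part of the block's content.
    while body and not body[-1].strip():
        body.pop()
    if not body:
        return "", i
    # The first non-blank body line fixes the block's indentation.
    first = next(line for line in body if line.strip())
    block_indent = len(first) - len(first.lstrip(" "))
    text = "\n".join(line[block_indent:] for line in body)
    return text + "\n", i
```

The point of the sketch is that the end of the block depends on comparing each line's indentation with the key's, which is the context-sensitive part a generated lexer needs custom code for.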

If there are other issues then it might be easier just to create the specification and test suite, and then rely on others to port the implementation to other languages. StrictYAML is a very simple format (simpler than JSON even), and things like regex are pretty cross-platform anyway, so porting shouldn't be too hard.
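
One way the specification-plus-test-suite route could look: conformance cases stored as plain data (JSON here) and a tiny runner that each port implements in its own language. Everything below is a hypothetical sketch; the case format, `run_cases` and `toy_parse` are made up for illustration and no such suite exists in this thread.

```python
import json
from typing import Callable, List

# Hypothetical conformance cases: StrictYAML input plus the expected data as JSON.
CASES_JSON = """
[
  {"name": "flat mapping", "input": "a: x\\nb: c\\n", "expected": {"a": "x", "b": "c"}},
  {"name": "values stay strings", "input": "n: 1\\n", "expected": {"n": "1"}}
]
"""


def run_cases(parse: Callable[[str], object], cases_json: str = CASES_JSON) -> List[str]:
    """Return the names of the cases that `parse` gets wrong."""
    failures = []
    for case in json.loads(cases_json):
        try:
            ok = parse(case["input"]) == case["expected"]
        except Exception:
            ok = False
        if not ok:
            failures.append(case["name"])
    return failures


def toy_parse(text: str) -> dict:
    """Stand-in implementation under test: flat `key: value` lines only."""
    result = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(":")
            result[key.strip()] = value.strip()
    return result
```

Calling `run_cases(toy_parse)` returns an empty list, while an implementation that cast `1` to an integer would fail the "values stay strings" case.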

shoogle avatar Apr 22 '20 10:04 shoogle

I think antlr has that feature but I'd like to spike it first before moving down that path.

crdoconnor avatar Apr 22 '20 11:04 crdoconnor

But yes, if that doesn't work I'd be happy to follow the second strategy.

fwiw I don't think the spec is as simple as JSON although it is a lot simpler than YAML.

crdoconnor avatar Apr 22 '20 11:04 crdoconnor

I also don't really want to do development in antlr and hit a roadblock that requires raising a ticket, coz they don't seem super responsive to feature requests.

crdoconnor avatar Apr 22 '20 11:04 crdoconnor