cson-parser

Formal definition of CSON

Open hildjj opened this issue 10 years ago • 23 comments

Moving https://github.com/bevry/cson/issues/38 over to this project.

JSON suffered from being too tied to the JavaScript programming language early on. I suggest a document describing the format being parsed, so that other interoperable implementations can be built.

The key here is "interoperable". I want to write CSON parsers in C, Python, etc. that don't have the assumptions of ECMAScript (particularly with respect to duplicate keys, strings, and numbers) baked in.
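To illustrate why duplicate keys and numbers matter for interoperability, here is a minimal Python sketch using only the standard `json` module on plain JSON input (not CSON). The helper name `reject_duplicates` is made up for this example. It shows two places where a parser has to make a choice that a spec would need to pin down: what happens with a repeated key, and whether an integer above 2**53 survives parsing.

```python
import json

# Duplicate keys: Python's json module silently keeps the last value.
doc = json.loads('{"a": 1, "a": 2}')
print(doc)


def reject_duplicates(pairs):
    """Hypothetical strict policy: refuse objects that repeat a key."""
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate key in object")
    return dict(pairs)


# object_pairs_hook receives every (key, value) pair, duplicates included,
# so a stricter policy can be layered on top of the same parser.
strict = json.loads('{"a": 1, "b": 2}', object_pairs_hook=reject_duplicates)

# Numbers: Python parses this literal as an exact int, but an IEEE 754
# double (the number type ECMAScript uses) cannot represent 2**53 + 1.
big = json.loads("9007199254740993")
as_double = float(big)
print(big, int(as_double))
```

Neither behavior is "right" in the abstract; the point is that two implementations can disagree on the same input unless the format says which one is expected.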

hildjj avatar Feb 09 '15 16:02 hildjj

Thanks! This sounds like a great idea. My biggest concern would be us breaking existing CSON files out there (that rely on all the dirty hacks CoffeeScript allows).

jkrems avatar Feb 09 '15 17:02 jkrems

It's better to break things earlier than later. I would suggest starting by declaring what you think the format is, then dealing with the edge cases as they are reported. It's not going to get easier as CoffeeScript evolves.

hildjj avatar Feb 09 '15 20:02 hildjj

Yeah, without a spec, CSON parser libraries written for other languages are extremely unlikely to accept the same set of inputs and produce the same outputs. A file written for one will break when parsed by another. (I've played around with CoffeeScript a little, and heck if I can tell you how the compiler will parse a given input.) You'll have Markdown all over again.

dgreensp avatar Mar 05 '15 17:03 dgreensp

Just a thought: how about the spec being written in Literate CoffeeScript with executable test cases?

ghost avatar Mar 05 '15 18:03 ghost

If we add a spec, I would think more along the lines of a grammar, e.g. a PEG. That way we could also implement cson-parser in terms of that spec. And PEG should be reasonably portable so that it's easy to consume/port to other languages.

jkrems avatar Mar 05 '15 18:03 jkrems

:+1:

ghost avatar Mar 05 '15 18:03 ghost

The gold standard for a spec is, of course, the JSON spec (http://www.json.org/), lest anyone think a spec must be a long, stuffy document. All a spec has to do is communicate the language to a human implementer, and be unambiguous. A PEG works as long as it is sufficiently human-readable.

dgreensp avatar Mar 05 '15 18:03 dgreensp

The JSON spec format works for a simple, straightforward data format like JSON. I doubt it will still be nice and understandable when it meets the monstrosity that is CoffeeScript syntax. ;) I'm not 100% convinced that implying CSON is a viable data interchange format is doing any good, especially since CSON supports operations that are pretty tightly coupled to JavaScript floating-point semantics, etc. That's not an argument against properly spec'ing what it looks like, just against adding wording to the docs suggesting that it's a good idea to use it across stacks instead of JSON or YAML.

jkrems avatar Mar 05 '15 22:03 jkrems

So, you're saying that if I don't have a CoffeeScript parser handy in my language, I should use YAML (JSON doesn't have comments). I'll accept that, stop complaining, and leave you to your much smaller corner of the Internet than you could have had.

hildjj avatar Mar 06 '15 17:03 hildjj

I still believe this is worth doing. Sorry if my previous comment was misleading in that regard.

jkrems avatar Mar 06 '15 18:03 jkrems

Atom stores configuration data on disk in CSON, so that's what got me interested in whether this was a "real" format (that could conceivably be read by an arbitrary program) or not. That said, Emacs configuration is stored in the form of ELisp programs, and it's still my favorite editor. :)

dgreensp avatar Mar 06 '15 18:03 dgreensp

By the way, it looks like CSON is a superset of JSON.

tomek-he-him avatar Apr 24 '15 20:04 tomek-he-him

I'd add an "(*) well-formatted JSON". But yes, definitely worth mentioning in a potential spec.

jkrems avatar Apr 24 '15 20:04 jkrems

+1

jmatsushita avatar May 27 '15 10:05 jmatsushita

I just found CSON, and it looks interesting. I am one of the authors of a JSON library, but our architecture allows us to parse and serialize other formats as long as the internal data model is sufficiently similar. CSON seems to fit that bill.

As we are a C++ library, any kind of reference to CoffeeScript (or a reference implementation written in CoffeeScript) is mostly useless to us. I can only repeat and stress the importance of having a real specification that is agnostic of the implementation language.

We are also not just the authors of a JSON library, but for parsing we are also using our own PEG parser library, the PEGTL. I have some experience with writing extended JSON grammars, e.g. we are just about to define a standard for "relaxed JSON", calling it JAXN.

I'd like to see if CSON is another candidate for our library and if there is interest from your side to come up with a more formal specification for it. If you could write a (complete) list of features that CSON should have (being a subset of CoffeeScript), I could try to come up with a PEG or a CFG for it (similar to the actual JSON grammar from RFC 7159).

I'd like to cooperate on this, but I would also like to avoid wasting each other's time in case we cannot agree on some common goals. Please let me hear your thoughts about this and whether you can see CSON becoming a CoffeeScript-independent, self-contained standard (which can still be a subset of CoffeeScript; that is not a problem).

d-frey avatar Jun 26 '17 21:06 d-frey

I know that @dbushong spent some time writing a PEG (?) for CSON while trying to migrate away from our dependency on coffee-script. I'm not 100% sure where exactly he ran into problems. I think it was something about the fairly liberal whitespace handling in CoffeeScript?

jkrems avatar Jun 26 '17 21:06 jkrems

Yeah, lemme see if I can find my work thus far and stick it somewhere.

dbushong avatar Jun 26 '17 21:06 dbushong

https://github.com/groupon/cson-parser/blob/dpb-native-parser/src/cson.pegjs

There's what I've got thus far. The issues I ran into were, unsurprisingly, around corner cases in object tree parsing. In certain cases (I'll try to dig up a repro) exdented objects are incorrectly parsed as part of the preceding object.

dbushong avatar Jun 26 '17 21:06 dbushong

A grammar will usually be only a starting point; additional rules will apply. This is even the case for the JSON grammar itself. I'll check out the grammar you wrote/linked and report back when I have had some more time for it... thanks so far.

d-frey avatar Jun 26 '17 21:06 d-frey

I just want a quick way to know how to include '' in a string value. There does not seem to be a simple here-are-all-the-rules document for this, or I am missing it. Seems to me maybe that's more appropriate an issue to bring up at bevry/cson#38 - but that issue, of course, led me here.

refactorized avatar May 30 '18 19:05 refactorized

I believe CSON accepts "..." or '...', so you should be able to say foo: "this 'and' that"

You should also be able to backslash-escape things, so even foo: 'this \'and\' that'
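As a rough sketch of the two forms above, the following Python function decodes a single quoted literal. It is not cson-parser's API and not a statement of CoffeeScript's actual rules: the name `read_quoted` is hypothetical, and it assumes that a backslash simply makes the next character literal. Real CoffeeScript strings also have escapes such as `\n` and interpolation in double-quoted strings, which this deliberately ignores.

```python
def read_quoted(src: str) -> str:
    """Decode one quoted literal (hypothetical helper, simplified rules)."""
    if len(src) < 2 or src[0] not in "'\"" or src[-1] != src[0]:
        raise ValueError("expected a literal wrapped in matching quotes")
    quote, body = src[0], src[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")
            # Assumption: the escaped character is taken literally.
            out.append(body[i + 1])
            i += 2
        elif ch == quote:
            raise ValueError("unescaped quote inside literal")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Both spellings from the comment above decode to the same text.
double_form = read_quoted("\"this 'and' that\"")
single_form = read_quoted("'this \\'and\\' that'")
print(double_form == single_form)
```

Under these assumptions, the other quote character needs no escaping at all, which is usually the simpler way to write such a value.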

dbushong avatar May 30 '18 20:05 dbushong

Hey guys, I want to use CSON in a Flutter/Dart project (due to interoperability with legacy code). Unfortunately, there is no Dart parser for CSON. Furthermore, without any written specification, it's hard to write a parser myself. Any ideas what to do? Do you, by chance, know of any Dart CSON parser?

ehhc avatar Nov 13 '19 12:11 ehhc

This thread is about defining a spec that could be parsed with a PEG grammar. I started and abandoned defining this a while back, but currently the spec is "what this version of CoffeeScript + this library can parse". Sorry.

dbushong avatar Nov 14 '19 18:11 dbushong