Reader should not accept extra parens

Open helins opened this issue 4 years ago • 7 comments

It seems extra ) are ignored:

'(42))))))

;; => '(42)

helins avatar Jul 21 '21 19:07 helins

Presume this is the sandbox? I think this is an issue with readAll multi-form input. It should be fixed by adding a new grammar rule that explicitly matches EOF after multiple forms.

mikera avatar Jul 22 '21 00:07 mikera

I noticed it in my runner but indeed, I am using .readAll as well.

Indirect but not unrelated question: shouldn't .read stop when one form has been read? This is very desirable for reading successively from an InputStream like STDIN or a Reader. Currently, the following throws because it goes further than it needs to:

(AntlrReader/read ":foo [:bar") ;; Everything after :foo should be irrelevant

Concrete use case: my runner reading Convex forms one by one from STDIN. Cannot stream, must read everything in one go.

helins avatar Jul 22 '21 16:07 helins

We could create a different grammar rule for each of the use cases:

  • Consume first form, ignore everything else
  • Consume entire input as sequence of forms
  • Consume entire input as single form

Would that work for you? It should just be one line each in the grammar.

mikera avatar Jul 23 '21 06:07 mikera
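
To make the three proposed entry points concrete, here is a minimal sketch of how they differ, written against a hand-rolled toy reader that only understands integers, keyword-like atoms and parenthesised lists. Everything in it (`ReaderModesSketch`, `readFirst`, `readAll`, `readOne`) is hypothetical and for illustration only; it is not Convex's `AntlrReader` and not its ANTLR grammar. The point is just where the end-of-input check goes in each mode.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy reader for a tiny subset (integers, atoms, lists). Hypothetical; not Convex's AntlrReader. */
final class ReaderModesSketch {
    private final String src;
    private int pos = 0;

    private ReaderModesSketch(String src) { this.src = src; }

    private void skipWhitespace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    /** Parses exactly one form starting at the current position. */
    private Object form() {
        skipWhitespace();
        if (pos >= src.length()) throw new IllegalArgumentException("Unexpected end of input");
        char c = src.charAt(pos);
        if (c == ')') throw new IllegalArgumentException("Unexpected ')' at " + pos);
        if (c == '(') {
            pos++;
            List<Object> items = new ArrayList<>();
            while (true) {
                skipWhitespace();
                if (pos >= src.length()) throw new IllegalArgumentException("Unclosed '('");
                if (src.charAt(pos) == ')') { pos++; return items; }
                items.add(form());
            }
        }
        int start = pos;
        while (pos < src.length() && !Character.isWhitespace(src.charAt(pos))
                && src.charAt(pos) != '(' && src.charAt(pos) != ')') pos++;
        String atom = src.substring(start, pos);
        // Integers become Long; anything else (e.g. :foo) is kept as a String.
        return atom.matches("-?\\d+") ? (Object) Long.valueOf(atom) : atom;
    }

    /** Mode 1: consume the first form, ignore everything else. */
    static Object readFirst(String s) {
        return new ReaderModesSketch(s).form();
    }

    /** Mode 2: consume the entire input as a sequence of forms. */
    static List<Object> readAll(String s) {
        ReaderModesSketch r = new ReaderModesSketch(s);
        List<Object> forms = new ArrayList<>();
        r.skipWhitespace();
        while (r.pos < r.src.length()) {
            forms.add(r.form()); // a stray ')' here is an error, not silently dropped
            r.skipWhitespace();
        }
        return forms;
    }

    /** Mode 3: consume the entire input as a single form; trailing input is an error. */
    static Object readOne(String s) {
        ReaderModesSketch r = new ReaderModesSketch(s);
        Object f = r.form();
        r.skipWhitespace();
        if (r.pos < r.src.length()) throw new IllegalArgumentException("Unexpected input at " + r.pos);
        return f;
    }

    public static void main(String[] args) {
        System.out.println(readFirst(":foo (:bar")); // succeeds: only the first form is examined
        System.out.println(readAll("1 (2 3)"));
        try {
            readAll("(42)))))");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the extra-parens case from the original report fails in modes 2 and 3 because the whole input has to be accounted for, while mode 1 deliberately stops after the first complete form.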

The current .readAll effectively produces a list of forms, which is useful. There are cases where I need to read everything at once (e.g. reading a file containing several forms).

The problem is rather that .read tries to consume the entire input, whereas consuming the first form (and ignoring the rest) is more flexible. This is what the Clojure reader does, it seems. And it works great with streams since you can work on a form-by-form basis.

helins avatar Jul 23 '21 12:07 helins
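
As a rough illustration of the form-by-form behaviour described above, the sketch below reads from a `java.io.Reader` and stops as soon as one form is complete, so the next call picks up where the previous one left off. It uses a hand-written toy reader (atoms and parenthesised lists only); the class `StreamingReaderSketch` and its `next` method are assumptions made up for this example, not part of Convex or ANTLR. A `PushbackReader` is used so that a delimiter read while scanning an atom can be put back.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Toy incremental reader (atoms and lists only). Hypothetical; not Convex's AntlrReader. */
final class StreamingReaderSketch {
    private final PushbackReader in;

    StreamingReaderSketch(Reader source) {
        this.in = new PushbackReader(source, 1);
    }

    /** Returns the next form, or null at a clean end of stream. */
    Object next() throws IOException {
        int c = skipWhitespace();
        if (c == -1) return null;
        return form(c);
    }

    /** Skips whitespace and returns the first non-whitespace character, or -1 at end of stream. */
    private int skipWhitespace() throws IOException {
        int c;
        do { c = in.read(); } while (c != -1 && Character.isWhitespace(c));
        return c;
    }

    /** Parses one form whose first character, c, has already been read. */
    private Object form(int c) throws IOException {
        if (c == ')') throw new IOException("Unexpected ')'");
        if (c == '(') {
            List<Object> items = new ArrayList<>();
            while (true) {
                int d = skipWhitespace();
                if (d == -1) throw new IOException("Unclosed '('");
                if (d == ')') return items; // list complete: stop reading here
                items.add(form(d));
            }
        }
        StringBuilder atom = new StringBuilder().append((char) c);
        while (true) {
            int d = in.read();
            if (d == -1 || Character.isWhitespace(d)) break;
            if (d == '(' || d == ')') { in.unread(d); break; } // leave the delimiter for the caller
            atom.append((char) d);
        }
        return atom.toString();
    }

    public static void main(String[] args) throws IOException {
        StreamingReaderSketch r = new StreamingReaderSketch(new StringReader(":foo (1 2) :bar"));
        for (Object f = r.next(); f != null; f = r.next()) {
            System.out.println(f);
        }
    }
}
```

With this design, input such as `:foo (:bar` yields `:foo` on the first call and only fails on the second, which is the behaviour asked for when reading interactively. One caveat of the approach: an atom is only known to be complete once a delimiter or end of stream arrives, so on an interactive stream the reader blocks until then.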

If you have an idea on how to quickly fix .read, it would really improve interactivity in the runner :)

helins avatar Jul 24 '21 06:07 helins

I tried simply removing EOF after single form.

It worked but:

a) It introduced a couple of regressions. E.g. 12e2.42 would parse as 12e2 on read and as (12e2 .42) on readAll (instead of failing).

b) Consuming from a stream (InputStream or Reader) with read is eager: only one form is parsed but the stream is closed. The idea is to be able to consume from the stream one form at a time.

Solutions:

a) Generally enforce that forms must be separated by whitespace?

b) No idea, frankly. I tried to google it but didn't find much. However, it's a very common requirement in practice; I'm just too much of an ANTLR noob to find the answer, I think. Maybe you'll know where to start?

helins avatar Jul 25 '21 07:07 helins

I think this is an artefact of how ANTLR divides the stream up into tokens. It's probably possible to add a special case to catch 12e2.42 (i.e. always parse this as a single token).

Removing EOF after a single form seems like a bad idea: read should not succeed on input like 1 2.

mikera avatar Aug 02 '21 08:08 mikera
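
The 12e2.42 tokenization effect discussed in this thread can be reproduced outside ANTLR with plain regular expressions. The sketch below is only an approximation under stated assumptions: `NUMBER` is a made-up number pattern, not the rule from Convex's grammar, and `NumberTokenSketch` is a hypothetical helper. It shows that repeated prefix matching happily splits 12e2.42 into two tokens, and that requiring a delimiter (or end of input) after each number, in the spirit of the "forms must be separated by whitespace" suggestion, turns the same input into an error.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrates token splitting with a hypothetical number rule; not Convex's grammar. */
final class NumberTokenSketch {
    // Assumed number syntax: optional sign, digits with optional fraction (or a bare fraction), optional exponent.
    static final String NUMBER = "[+-]?(?:\\d+(?:\\.\\d+)?|\\.\\d+)(?:[eE][+-]?\\d+)?";

    // Loose: a number token ends wherever the pattern stops matching.
    static final Pattern LOOSE = Pattern.compile(NUMBER);

    // Strict: a number token must be followed by end of input, whitespace or a bracket.
    static final Pattern STRICT = Pattern.compile(NUMBER + "(?=$|[\\s()\\[\\]{}])");

    /** Splits the input into number tokens by repeated prefix matching; throws if nothing matches. */
    static List<String> tokens(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) { pos++; continue; }
            m.region(pos, input.length());
            if (!m.lookingAt()) throw new IllegalArgumentException("Bad token at " + pos);
            out.add(m.group());
            pos = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens(LOOSE, "12e2.42"));   // two tokens: the regression described above
        System.out.println(tokens(STRICT, "12e2 .42")); // two tokens, properly separated
        try {
            tokens(STRICT, "12e2.42");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Whether the real fix belongs in the lexer (a single token for the malformed number) or in a separator requirement between forms is a design decision for the grammar; the sketch only shows why the split happens once the end-of-input check is removed.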