lisp-rs icon indicating copy to clipboard operation
lisp-rs copied to clipboard

Some things that crash or give the wrong result

Open NicholasSterling opened this issue 3 years ago • 10 comments

I love this project, and very much appreciate the effort that went into it. Kudos!

I'm not sure whether this is intended for actual use or just a simple teaching exercise, but I thought I'd mention a few things that failed (for some definition of "fail", e.g. crash or do something unexpected).

12 -- expects an outer set of parens

(4) -- evals to (4), not 4; (+ 3 1) evals to 4

(print "foo) -- prints foo)

(") -- evals to ())

Are you open to pull requests? There are also a few things that could be simplified.

NicholasSterling avatar May 13 '22 19:05 NicholasSterling

@NicholasSterling, please create a pull request against the main branch.

vishpat avatar May 19 '22 23:05 vishpat

Done. The PR does not attempt to fix any of the errors, just suggested simplifications.

By the way, the lexer converts the UTF-8 String to a Vec and then processes the characters one at a time, from the beginning, removing each character as it goes. Removing a character from the front of a Vec requires moving the rest of them, so processing an N-char input is N^2.

NicholasSterling avatar May 22 '22 18:05 NicholasSterling

@vishpat I am pretty sure you can fix the N^2 lexing by getting an Iterator for the input string and calling peekable() so that you can peek at the next element. I could do this for you, but since my other PR is just sitting there and you are making further changes to the lexer, I'll wait for you to resolve all of that.

If you decide to do this yourself, I'm suggesting getting rid of the Vec in tokenize() and doing

let mut iter = input.chars().peekable();

Then when you need characters, you can do

while let Some(ch) = iter.next() { ... } or just peek at the next char withiter.peek().

If you want me to do it, just give me the all-clear when the lexer dust has settled.

NicholasSterling avatar May 24 '22 00:05 NicholasSterling

(4) should throw an error, applying nonfunction.

jcubic avatar May 31 '22 12:05 jcubic

@NicholasSterling consider the lexer dust settled. Please open a PR

vishpat avatar May 31 '22 16:05 vishpat

@vishpat OK, thanks -- will do so by the end of the weekend.

NicholasSterling avatar Jun 02 '22 05:06 NicholasSterling

OK, that was too optimistic. I'll give an update in a few days.

NicholasSterling avatar Jun 06 '22 06:06 NicholasSterling

I'm overdue for an update. I decided to do something more sophisticated, with these goals:

  1. Fix the N^2 problem.
  2. Only allocate memory when necessary (e.g. not for keywords or numbers).
  3. Make line and column number info available, for diagnostics.

I looked in crates.io for something that would help, and Tokenator looked promising, but it looked a bit clumsy to use, so I decided to do it a bit differently. I have been working on a new crate that helps one write tokenizers. It appears to be working, but I need to do more testing.

NicholasSterling avatar Jun 20 '22 00:06 NicholasSterling

@vishpat After much thrashing over the API, the tokenizer crate I have been working on (token-iter) is ready. If you are OK with it, I would like to make lisp-rs use it, let you check it out, and then publish the crate. If you would rather not use a separate crate for tokenization, I understand, but please let me know either way.

https://github.com/NicholasSterling/token-iter/blob/main/src/lib.rs

NicholasSterling avatar Jul 07 '22 19:07 NicholasSterling

@vishpat Have you had a chance to look at the token-iter crate? Are you OK with depending on an external crate for lexical analysis? https://docs.rs/token-iter/latest/token_iter/

NicholasSterling avatar Jul 28 '22 02:07 NicholasSterling