[ICE] unput doesn't play well with consume_whitespace

Open jyn514 opened this issue 4 years ago • 1 comments

Code

~~Note the error message is from a local copy, I plan to make a PR soon.~~ Merged in #362

#i ""
/b
The application panicked (crashed).
Message:  unputting '\n' would cause the lexer to forget it saw 'b' (current is '/')
Location: src/lex/mod.rs:153

Expected behavior

The lexer should output the tokens Hash, Id("i"), Str(""), Slash, and Id("b"). Additionally, seen_line_token should be set appropriately at all times.
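As a rough illustration of the expected output, here is a toy tokenizer that indexes into a fully buffered input, so it never hits the lookahead limit. It is not rcc's lexer: `toy_lex`, the cut-down `Token` enum, and the "first token on its line" flag are hypothetical, and the flag reflects only my reading of what seen_line_token is meant to track.

```rust
/// Hypothetical, cut-down token type; rcc's real one has many more variants.
#[derive(Debug, PartialEq)]
enum Token {
    Hash,
    Slash,
    Id(String),
    Str(String),
}

/// Toy tokenizer, just enough for this reproduction.
/// Each token is paired with `true` if it is the first token on its line
/// (an assumption about what seen_line_token is for).
fn toy_lex(src: &str) -> Vec<(Token, bool)> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let mut seen_line_token = false;
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == '\n' {
            // A newline starts a fresh line for the preprocessor.
            seen_line_token = false;
            i += 1;
            continue;
        }
        if c.is_whitespace() {
            i += 1;
            continue;
        }
        let token = match c {
            '#' => { i += 1; Token::Hash }
            '/' => { i += 1; Token::Slash }
            '"' => {
                // No escape handling: read up to the next quote.
                let start = i + 1;
                let mut end = start;
                while end < chars.len() && chars[end] != '"' {
                    end += 1;
                }
                i = end + 1;
                Token::Str(chars[start..end].iter().collect())
            }
            _ if c.is_alphabetic() || c == '_' => {
                let start = i;
                while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                    i += 1;
                }
                Token::Id(chars[start..i].iter().collect())
            }
            // Anything else is skipped in this sketch.
            _ => { i += 1; continue; }
        };
        tokens.push((token, !seen_line_token));
        seen_line_token = true;
    }
    tokens
}
```

Calling `toy_lex("#i \"\"\n/b")` yields the five tokens listed above, with only Hash and Slash flagged as the first token on their line.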

This is hard because I only want to peek ahead 2 characters, while the preprocessor still needs to know where newlines occur. I think the real fix will be to implement \n as a token (#356).

Extended description (from discord):

The lexer is streaming: it looks at one byte at a time. Sometimes it needs to look at several bytes, but it has no buffer to store them in, so it uses current for one byte ahead and lookahead for 2 bytes ahead. The issue is that it's trying to remember 3 bytes when it only has space for 2. The culprit is consume_whitespace, which calls peek_next when it sees /, and that sets lookahead. The preprocessor cares about newlines even though the lexer doesn't, so the lexer still needs to keep track of '\n', which involves setting seen_line_token. As a result, parse_string has to unput a newline after consume_whitespace gets rid of it so that seen_line_token gets set appropriately.

See https://github.com/jyn514/rcc/blob/master/src/lex/mod.rs#L610 for more details.
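The two-slot limitation described above can be reproduced in isolation. The sketch below is a simplified stand-in, not the code in src/lex/mod.rs: the struct layout, `next_char`, `peek`, and `demo` are hypothetical, and `unput` returns an error here instead of panicking so the failure is easy to inspect.

```rust
/// Hypothetical two-slot pushback buffer, modeled loosely on the description
/// in this issue (not rcc's actual implementation).
struct Lexer<'a> {
    chars: std::str::Chars<'a>,
    current: Option<char>,   // one character ahead
    lookahead: Option<char>, // two characters ahead
}

impl<'a> Lexer<'a> {
    fn new(src: &'a str) -> Self {
        Lexer { chars: src.chars(), current: None, lookahead: None }
    }

    /// Consume the next character, draining the slots before the input.
    fn next_char(&mut self) -> Option<char> {
        match self.current.take() {
            Some(c) => {
                self.current = self.lookahead.take();
                Some(c)
            }
            None => self.chars.next(),
        }
    }

    /// Look one character ahead; fills `current`.
    fn peek(&mut self) -> Option<char> {
        if self.current.is_none() {
            self.current = self.chars.next();
        }
        self.current
    }

    /// Look two characters ahead; fills `lookahead`.
    fn peek_next(&mut self) -> Option<char> {
        self.peek()?;
        if self.lookahead.is_none() {
            self.lookahead = self.chars.next();
        }
        self.lookahead
    }

    /// Push a character back. Fails when both slots are already occupied,
    /// because a third character has nowhere to go.
    fn unput(&mut self, c: char) -> Result<(), String> {
        if let (Some(cur), Some(next)) = (self.current, self.lookahead) {
            return Err(format!(
                "unputting {:?} would cause the lexer to forget it saw {:?} (current is {:?})",
                c, next, cur
            ));
        }
        self.lookahead = self.current.take();
        self.current = Some(c);
        Ok(())
    }
}

/// Replays what happens after the `""` literal in the reproduction.
fn demo() -> Result<(), String> {
    let mut lexer = Lexer::new("\n/b");
    // consume_whitespace eats the newline...
    assert_eq!(lexer.next_char(), Some('\n'));
    // ...then sees '/' and peeks one further to check for a comment.
    assert_eq!(lexer.peek(), Some('/'));
    assert_eq!(lexer.peek_next(), Some('b'));
    // parse_string now tries to restore the newline, but both slots are full.
    lexer.unput('\n')
}
```

In this sketch `demo()` returns an `Err` carrying the same message as the panic above. Any design that holds three pending characters (a small queue, for example) would avoid the failure, though that is not necessarily the fix the issue is aiming for.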

Backtrace
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
                          (5 post panic frames hidden)                          
 5: rcc::lex::Lexer::unput::h70bde51f6a37bfb0
    at src/lex/mod.rs:153
 6: rcc::lex::Lexer::parse_string::hce2d12999d596f68
    at src/lex/mod.rs:619
 7: <rcc::lex::Lexer as core::iter::traits::iterator::Iterator>::next::{{closure}}::hb2227ef3fa992171
    at src/lex/mod.rs:845
 8: core::option::Option<T>::and_then::hce18bbfb15e478ce
    at /rustc/b8cedc00407a4c56a3bda1ed605c6fc166655447/src/libcore/option.rs:658
 9: <rcc::lex::Lexer as core::iter::traits::iterator::Iterator>::next::hc5dcf9be976562d5
    at src/lex/mod.rs:663
10: rcc::lex::cpp::PreProcessor::next_cpp_token::heb246ff3504f916b
    at src/lex/cpp.rs:431
11: <rcc::lex::cpp::PreProcessor as core::iter::traits::iterator::Iterator>::next::h3eb29428f711ff56
    at src/lex/cpp.rs:204
12: rcc::check_semantics::h763a0909e720b40a
    at src/lib.rs:169
13: rcc::compile::h39be05ac3eaedfb7
    at /home/joshua/src/rust/rcc/rcc/src/lib.rs:217
14: rcc::aot_main::h79a0fdb40de30e55
    at src/main.rs:137
15: rcc::real_main::hb7d5482550942e15
    at src/main.rs:125
16: rcc::main::hd4a33063d9febd1f
    at src/main.rs:203
                        (12 runtime init frames hidden)

jyn514 Mar 30 '20 03:03

This is not fixed by #437 which closes #356.

hdamron17 May 25 '20 04:05