gleam
gleam copied to clipboard
Many repeated expressions cause stack overflow in parser
Repeated quotes cause stack overflow in lexer
Good day, while fuzzing the compiler with honggfuzz for a competition I came across he following stack overflow in the lexer caused by large amounts of repeated quotation marks.
A (truncated) example:
fn f() {
""""""""""""""""""""""""""" // <lots more of these...>
}
Full input file:
Steps to reproduce:
- Create a new project
- Replace the
src/<filename>.gleam
with the example file attached - Run
gleam build
:
% gleam build
Compiling fff
thread 'main' has overflowed its stack
fatal runtime error: stack overflow
zsh: abort (core dumped) gleam build
Stack trace from gdb:
► f 0 0x555555b7e04f _ZN10gleam_core5parse5lexer14Lexer$LT$T$GT$10inner_next17hb729e48a11fd24a8E.llvm.1354553687974732476+143
f 1 0x555555c0ef6d gleam_core::parse::Parser<T>::next_tok+205
f 2 0x555555bfa2aa gleam_core::parse::Parser<T>::parse_expression_unit+1226
f 3 0x555555bf98d8 gleam_core::parse::Parser<T>::parse_expression+120
f 4 0x555555bfe979 gleam_core::parse::Parser<T>::parse_expression_seq+265
f 5 0x555555bfecfd gleam_core::parse::Parser<T>::parse_expression_seq+1165
f 6 0x555555bfecfd gleam_core::parse::Parser<T>::parse_expression_seq+1165
f 7 0x555555bfecfd gleam_core::parse::Parser<T>::parse_expression_seq+1165
I'm not sure why gdb didn't demangle the current frame, but that's gleam_core::parse::lexer::Lexer<T>::inner_next
.
I have not investigated further.
Thank you. Let's fix this to return a lex error rather than crashing
After a bit of exploration I don't think this is the lexer that is the problem, it is that the parser's parse_expression_seq
recursively calls itself, causing any block or function with a very large number of statements to crash the parser.
A test to replicate
// https://github.com/gleam-lang/gleam/issues/1707
#[test]
fn many_quotes_does_not_crash_the_lexer() {
assert!(crate::parse::parse_module(&format!(
"pub fn main() {{
{}
}}",
"Nil ".repeat(1024)
))
.is_err());
}
This is fiddly to resolve currently with how we represent Try
in the AST. This may get removed in future with #1709 at which point this code can be simplified