toml icon indicating copy to clipboard operation
toml copied to clipboard

Stack overflow when parsing a deeply nested toml_edit::Document

Open 5225225 opened this issue 3 years ago • 8 comments

Basically the same error as seen as https://github.com/alexcrichton/toml-rs/issues/428

Reproduction code:

fn main() {
    let brackets = "[".repeat(20000);
    let input_string = format!("x={}", &brackets);
    input_string.parse::<toml_edit::Document>().unwrap();
}

5225225 avatar Sep 13 '21 21:09 5225225

I assume this is also true for { (inline tables).

Likewise, it'd take longer but I assume we'd also hit OOM for many dotted keys, many tables, and many key-values.

For dotted keys, I'm actually unsure whether we would hit OOM (too many keys for memory) or stack overflow (recursively walking data structures) first.

epage avatar Sep 13 '21 22:09 epage

The issue is about stackoverflow, not OOM.

A solution to this problem is to use an explicit stack (using a vec) instead of recursion.

ordian avatar Sep 14 '21 08:09 ordian

I brought up OOM because

  • Stackoverflow is a special case of OOM
  • Because we generally don't handle OOM errors in Rust, they lead to the same behavior (crash)
  • Changing recursion to explicit stack just converts the problem from stackoverflow to OOM, it will just take longer to crash without the stack frame boilerplate

The main reason I can think of to handle stackoverflows specially is to provide a good error on accidental infinite recursion so people can get out of that case, which is not applicable here.

We can easily change descend-path from recursion but parsing of inline tables and arrays will be more difficult.

epage avatar Sep 14 '21 13:09 epage

Well, by OOM, people usually mean Out Of (heap) Memory. And one of the differences between the stack and the heap is that the default stack size is fixed and relatively small (8MiB), whereas the heap size is usually limited by the amount of RAM a computer has (yeah, it's more complicated than that).

So I would say, ensuring no stackoverflow is almost always lib's responsibility, as opposed to ensure no OO(heap)M.

ordian avatar Sep 14 '21 14:09 ordian

We could use something like https://github.com/rust-lang/stacker for this.

ordian avatar Jul 31 '22 15:07 ordian

Hi @ordian @epage While working on issue:https://github.com/ordian/toml_edit/issues/335 and running fuzzer locally for sometime, i found a similar stackoverflow crash. Following is the reproduction code.

#[test]
fn test_stackoverflow() {
    let testcase = "8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
{7x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[{0x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[8x0={22={2[";
    testcase.parse::<toml_edit::Document>().unwrap();    
}
running 1 test

thread 'test_stackoverflow' has overflowed its stack
fatal runtime error: stack overflow
error: test failed, to rerun pass `--test test_stackoverflow`

Caused by:
  process didn't exit successfully: `/home/maxx/dev/security/oss-fuzz-projects/toml_edit/target/debug/deps/test_stackoverflow-e48339b99633f4c2` (signal: 6, SIGABRT: process abort signal)

Just thought of sharing this.

manunio avatar Sep 19 '22 10:09 manunio

Yes, anything that will cause too much recursion will cause an overflow.

Speaking of ways of solving this, I've been thinking of switching this to nom once some ergoomic issues are resolved. I suspect I could more easily write a heap-allocated recursive parser with it. We'll see.

epage avatar Sep 19 '22 13:09 epage

That being said, it might also be worth adding recursion limits by default, since if you have a deeply nested object, you might run into issues working with it (drop being recursive and blowing the stack without taking care, for example).

I would be surprised if any object nested more than, say, 100 deep is actually legitimate. And if someone really does have objects nested that deep, they can disable the limits and be careful themselves.

5225225 avatar Sep 19 '22 13:09 5225225