toml
toml copied to clipboard
Stack overflow when parsing a deeply nested toml_edit::Document
Basically the same error as seen as https://github.com/alexcrichton/toml-rs/issues/428
Reproduction code:
fn main() {
let brackets = "[".repeat(20000);
let input_string = format!("x={}", &brackets);
input_string.parse::<toml_edit::Document>().unwrap();
}
I assume this is also true for {
(inline tables).
Likewise, it'd take longer but I assume we'd also hit OOM for many dotted keys, many tables, and many key-values.
For dotted keys, I'm actually unsure whether we would hit OOM (too many keys for memory) or stack overflow (recursively walking data structures) first.
The issue is about stackoverflow, not OOM.
A solution to this problem is to use an explicit stack (using a vec) instead of recursion.
I brought up OOM because
- Stackoverflow is a special case of OOM
- Because we generally don't handle OOM errors in Rust, they lead to the same behavior (crash)
- Changing recursion to explicit stack just converts the problem from stackoverflow to OOM, it will just take longer to crash without the stack frame boilerplate
The main reason I can think of to handle stackoverflows specially is to provide a good error on accidental infinite recursion so people can get out of that case, which is not applicable here.
We can easily change descend-path from recursion but parsing of inline tables and arrays will be more difficult.
Well, by OOM, people usually mean Out Of (heap) Memory. And one of the differences between the stack and the heap is that the default stack size is fixed and relatively small (8MiB), whereas the heap size is usually limited by the amount of RAM a computer has (yeah, it's more complicated than that).
So I would say, ensuring no stackoverflow is almost always lib's responsibility, as opposed to ensure no OO(heap)M.
We could use something like https://github.com/rust-lang/stacker for this.
Hi @ordian @epage While working on issue:https://github.com/ordian/toml_edit/issues/335 and running fuzzer locally for sometime, i found a similar stackoverflow crash. Following is the reproduction code.
#[test]
fn test_stackoverflow() {
let testcase = "8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
{7x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[{0x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[8x0={22={2[";
testcase.parse::<toml_edit::Document>().unwrap();
}
running 1 test
thread 'test_stackoverflow' has overflowed its stack
fatal runtime error: stack overflow
error: test failed, to rerun pass `--test test_stackoverflow`
Caused by:
process didn't exit successfully: `/home/maxx/dev/security/oss-fuzz-projects/toml_edit/target/debug/deps/test_stackoverflow-e48339b99633f4c2` (signal: 6, SIGABRT: process abort signal)
Just thought of sharing this.
Yes, anything that will cause too much recursion will cause an overflow.
Speaking of ways of solving this, I've been thinking of switching this to nom once some ergoomic issues are resolved. I suspect I could more easily write a heap-allocated recursive parser with it. We'll see.
That being said, it might also be worth adding recursion limits by default, since if you have a deeply nested object, you might run into issues working with it (drop
being recursive and blowing the stack without taking care, for example).
I would be surprised if any object nested more than, say, 100 deep is actually legitimate. And if someone really does have objects nested that deep, they can disable the limits and be careful themselves.