swc icon indicating copy to clipboard operation
swc copied to clipboard

There may be a problem with modestly deep recursion

Open StoneCypher opened this issue 1 year ago • 3 comments

Describe the bug

Hi. I'm the author of some trivial domain-specific language for authoring finite state machines.

I tried to port my machine's implementation to Deno, and I hit ground pretty hard. Loading the library causes deno to crash.

After some back and forth, they suggest that the problem comes from swc, specifically from that my parser recurses occasionally several hundred times, and that apparently we're hitting stack size limits.

This is actually quite common with generated parsers, and the grammar for my language is barely 40k, most of which is simple lists eg the CSS colors; the parser is not very complex at all by parser standards, and the compiled version that's taking Deno down runs happily in Netscape Navigator 4.

The Deno folks recommended that I come here, and propose one or more of the following. I am not a Rustacean, and I don't really know what I'm saying here; please try not to laugh too hard.

  1. They suggest perhaps to lower the weight of a recursion. I guess they might be talking about the RAM overhead of stack frames, in the hope that if the frames were slimmer, this would hit later? Not too clear on things here.
  2. To consider the adoption of a library called stacker, which appears to be explicitly for this purpose given this context.

They also propose, for themselves:

  1. to temporarily push the wall back by manually increasing the default size of their own stacks, which will solve my problem but not the general case

So, I'm bearing the torch. But I'm not a Deno person and as soon as you ask me questions I'm going to crumble under the lightest of loads. 😂

I'll do my best

Input code

There's 3 ways to look at this: do you want 

1) [the parser](https://github.com/StoneCypher/jssm/blob/main/src/ts/jssm-dot.peg)
2) [the compiled parser](https://github.com/StoneCypher/jssm/blob/main/dist/es6/jssm-dot.js) (warning: 270k single line of code)
3) The pretty-printed parser.  Unfortunately, since the code is 270k, the ones that put it in a URL run out of space, but https://beautifier.io/ is able to prettify that code (expect a ten second delay)

Config

No idea.  It's `Deno`.

Playground link

Repro link doesn't make sense because the code is over max-url length. However, cut and

Expected behavior

Uh. I don't really know, to be honest? I'm not your user here. The Deno developers are.

My understanding is that what's going on here is that deno uses swc to perform static analysis and possibly more, and that the static analysis phase is where this is going south.

And so, if I drop the compiled parser in your playground above (sorry about not giving a link, it's like 500k thanks to url encoding) then I do get the same result.

And I guess I think the answer is "it should compile" but that seems like kind of a smarmy thing to say

Actual behavior

As best as I can tell, it's crashing due to legitimate stack out-of-bounds access, due to the significant depth of recursion.

It's crashing around 250 recurses down, in Deno win64, which I gather has a default 1 meg stack.

And setting aside how funny it is that Deno's default stack is smaller than my emm386 configuration was back in the DOS days, also, I guess I feel like burning a meg to recurse 250 stack frames is a lot. That's 4k per frame.

Owen Wilson saying wow.

The heap is pretty cool too

RuntimeError: memory access out of bounds

RuntimeError: memory access out of bounds
    at swc_ecma_parser::parser::class_and_fn::<impl swc_ecma_parser::parser::Parser<I>>::parse_decorators::h164222227b3d5598 (https://cdn.jsdelivr.net/npm/@swc/[email protected]/wasm-web_bg.wasm:wasm-function[1318]:0x8a4d8e)
    at swc_ecma_parser::parser::expr::<impl swc_ecma_parser::parser::Parser<I>>::parse_primary_expr::h19e66b68d2beb39b (https://cdn.jsdelivr.net/npm/@swc/[email protected]/wasm-web_bg.wasm:wasm-function[77]:0x19fa02)
    at swc_ecma_parser::parser::expr::<impl swc_ecma_parser::parser::Parser<I>>::parse_member_expr_or_new_expr::h74fbb55ab42f21fe (https://cdn.jsdelivr.net/npm/@swc/[email protected]/wasm-web_bg.wasm:wasm-function[931]:0x77aa46)
    at swc_ecma_parser::parser::expr::<impl swc_ecma_parser::parser::Parser<I>>::parse_lhs_expr::hd2b41e42472cbc74 (https://cdn.jsdelivr.net/npm/@swc/[email protected]/wasm-web_bg.wasm:wasm-function[641]:0x63eb49)
    at swc_ecma_parser::parser::expr::ops::<impl swc_ecma_parser::parser::Parser<I>>::parse_unary_expr::h537c51507f320da4 (https://cdn.jsdelivr.net/npm/@swc/[email protected]/wasm-web_bg.wasm:wasm-function[855]:0x731371)
    at swc_ecma_parser::parser::expr::ops::<impl swc_ecma_parser::parser::Parser<I>>::parse_bin_expr::h6240bffaf1b976ea (https://cdn.jsdelivr.net/npm/@swc/[email protected]/wasm-web_bg.wasm:wasm-function[4205]:0xc9bdb6)
    at swc_ecma_parser::parser::expr::<impl swc_ecma_parser::parser::Parser<I>>::parse_assignment_expr_base::h34eaff0bd39f6104 (https://cdn.jsdelivr.net/npm/@swc/[email protected]/wasm-web_bg.wasm:wasm-function[670]:0x66349c)
    at swc_ecma_parser::parser::expr::<impl swc_ecma_parser::parser::Parser<I>>::parse_assignment_expr::h0a53920365a0bb34 (https://cdn.jsdelivr.net/npm/@swc/[email protected]/wasm-web_bg.wasm:wasm-function[6787]:0xe3ceaa)
    at swc_ecma_parser::parser::expr::<impl swc_ecma_parser::parser::Parser<I>>::parse_expr_or_spread::heed790056c7aa926 (https://cdn.jsdelivr.net/npm/@swc/[email protected]/wasm-web_bg.wasm:wasm-function[4887]:0xd21e19)
    at swc_ecma_parser::parser::expr::<impl swc_ecma_parser::parser::Parser<I>>::parse_args::{{closure}}::h39b12f2743a51cf4 (https://cdn.jsdelivr.net/npm/@swc/[email protected]/wasm-web_bg.wasm:wasm-function[1461]:0x8fb8bf)

Version

no idea. current i think? don't really know how to check

Additional context

I know my bug report was terrible and I'm sorry

image

StoneCypher avatar Aug 12 '22 20:08 StoneCypher

To make this issue a bit more standalone:

Simply trying to parse the generated code in question with swc is able to hit the default 1MB main thread stack limit on Windows. It seems to be due to the deep nesting, combined with swc's use of a hand written recursive descent parser.

Normally I would feel that such deep nesting is silly, and don't do that, but the fact that this is generated code by a commonly used parser generator, and that the grammar of the parser is not at all unreasonable are good counter arguments that this case needs to be supported.

The fact that the nesting limit is so low on windows suggests the stack frames contain a lot of data, which probably should be slimmed down. This could potentially allow for parsers with even larger grammars, but there would still be a limit somewhere.

Having some nesting limit due to the stack is pretty inherent to recursive descent parsers. The only real ways to avoid that are:

  1. To have swc migrate away from recursive descent parser (Which is probably an unreasonable ask).
  2. Have the swc parser get transformed to use some stackless alternative to recursion (probably not really realistic)
  3. Have the recursive descent parser utilize a segmented stack approach. This appears to be how rustc is handling things, utilizing the mentioned stacker library to avoid crashing from deeply nested recursion. It was added in rust pull request 55617.

I'm not sure how necessary trying to solve this for all possible programs actually is. If the stack frame sizes can be improved enough, perhaps the limit becomes sufficiently large that nobody really cares. But perhaps trying to solve it fully is actually quicker and simpler. I'm not really sure what the overhead of rustc's approach actually is in practice. It might be small enough to be worth it to avoid further whack-a-mole with the stack size limits.

KevinCathcart avatar Aug 13 '22 18:08 KevinCathcart

It may be worth noting that I appear to be the second instance of a wall move on Windows.

https://github.com/denoland/deno/issues/9752

It may also be worth noting that swc already reduced stack frame size for Deno once (twice if you count debug builds,) suggesting that a more permanent solution may be warranted

https://github.com/denoland/deno/pull/5509

https://github.com/swc-project/swc/pull/799

Admittedly, as a person Not Doing The Work ™, my opinion is of distinctly limited value here

StoneCypher avatar Aug 13 '22 21:08 StoneCypher

thank you @kdy1

StoneCypher avatar Aug 19 '22 00:08 StoneCypher

This issue has been automatically closed because it received no activity for a month and had no reproduction to investigate. If you think this was closed by accident, please leave a comment. If you are running into a similar issue, please open a new issue with a reproduction. Thank you.

swc-bot avatar Oct 03 '22 23:10 swc-bot

this should not be closed

i don't agree with the use of auto-close. it just hides problems that aren't easy to fix fast

StoneCypher avatar Oct 04 '22 01:10 StoneCypher

@kdy1 - I just saw the "need more info." Can I help?

StoneCypher avatar Oct 04 '22 01:10 StoneCypher

Also, there actually is a reproduction. How do I tell the bot that?

StoneCypher avatar Oct 04 '22 01:10 StoneCypher

Please file a new issue with the reproduction case

kdy1 avatar Oct 04 '22 02:10 kdy1

Oh my bad I thought I left a comment for asking repro case

kdy1 avatar Oct 04 '22 23:10 kdy1

I just closed this as reproducing a bug triggered by using Deno requires lots of time, so it's not a useful repro. I don't want to use an enormous amount of time for reducing such input. So please post a standalone reproduction case.

kdy1 avatar Oct 04 '22 23:10 kdy1

This closed issue has been automatically locked because it had no new activity for a month. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you.

swc-bot avatar Nov 04 '22 00:11 swc-bot