Angstrom-based parser causes stack overflow in JavaScript output

Open zbaylin opened this issue 1 year ago • 7 comments

This was pretty hard to debug because any JS executable that includes timere or timedesc crashes immediately. To replicate, create an empty OCaml file and build it with a dune stanza that includes

 (libraries timedesc)
 (modes js)

Then run node on the generated file with a large stack trace limit and source maps enabled:

node --stack-trace-limit=9999999999999999 _build/default/src/timere_empty.bc.js

This produces a (very) long stack trace that ends with:

    ...
    at caml_call3 (/Users/zbaylin/Development/timere-js-test/_build/default/src/timere_empty.bc.js:32762:28)
    at half_compressed_of_string (/Users/zbaylin/Development/timere-js-test/_build/default/src/timere_empty.bc.js:35666:17)
    at half_compressed_of_string_exn (/Users/zbaylin/Development/timere-js-test/_build/default/src/timere_empty.bc.js:35670:17)
    at /Users/zbaylin/Development/timere-js-test/_build/default/src/timere_empty.bc.js:35687:25
    at Object.<anonymous> (/Users/zbaylin/Development/timere-js-test/_build/default/src/timere_empty.bc.js:38071:3)

half_compressed_of_string lives here: https://github.com/daypack-dev/timere/blob/f8e2fd5e1d6fa415a4741da8cf311f8c4dbca921/timedesc/time_zone.ml#L651, and it seems to be the culprit.

I assume this has to do with the new Angstrom-based parser, but I haven't gone into the semantics to figure out why.

zbaylin Sep 20 '22 20:09

Oh huh, this is curious. My recollection is that it runs fine in the browser, though that may be a faulty recollection too.

@glennsl were you using JS version of timedesc? Do you have any similar experience?

darrenldl Sep 21 '22 07:09

In the meantime, I'll try adding more commit calls (or whatever else stops Angstrom from backtracking)...

darrenldl Sep 21 '22 07:09

> @glennsl were you using JS version of timedesc? Do you have any similar experience?

I can't recall having any technical issues at all! But the project I was using it for unfortunately collapsed (for entirely unrelated reasons!) so I haven't tried it after you overhauled the dependencies.

glennsl Sep 21 '22 11:09

> so I haven't tried it after you overhauled the dependencies.

Ah, oh well.


I tried adding some commit calls to no avail, but found that the number passed to Angstrom.count matters a lot:

      let half_compressed : string M.t Angstrom.t =
        BE.any_uint16 >>=
        (fun table_count ->
           count table_count (commit *> half_compressed_name_and_table) >>|
           (fun l ->
              l
              |> List.to_seq
              |> M.of_seq
           )
        )

If I replace table_count with 100 it's fine, but it crashes at 200. The total number of tables is in the 200s, I think.

So I suspect count might be the culprit here, though I am still not sure what the proper fix should be...
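To illustrate why the repetition count can matter, here is a minimal sketch using a toy direct-style parser type. It is not Angstrom's implementation, and the real cause inside Angstrom may differ; `parser`, `any_char`, `count_rec` and `count_loop` are hypothetical names made up for this example. The recursive version keeps one stack frame alive per element, so its depth grows with the count, while the loop version uses constant stack depth.

```ocaml
(* A toy direct-style parser (NOT Angstrom's type): given the input and a
   position, return the parsed value and the next position, or None. *)
type 'a parser = string -> int -> ('a * int) option

(* Parse any single character. *)
let any_char : char parser =
 fun s pos -> if pos < String.length s then Some (s.[pos], pos + 1) else None

(* Recursive repetition: each element adds a stack frame that stays live
   until the rest of the list has been parsed, so depth grows with [n]. *)
let rec count_rec (n : int) (p : 'a parser) : 'a list parser =
 fun s pos ->
  if n <= 0 then Some ([], pos)
  else
    match p s pos with
    | None -> None
    | Some (x, pos') -> (
        match count_rec (n - 1) p s pos' with
        | None -> None
        | Some (xs, pos'') -> Some (x :: xs, pos''))

(* Loop-based repetition: constant stack depth regardless of [n]. *)
let count_loop (n : int) (p : 'a parser) : 'a list parser =
 fun s pos ->
  let acc = ref [] and cur = ref pos and ok = ref true and i = ref 0 in
  while !ok && !i < n do
    (match p s !cur with
     | None -> ok := false
     | Some (x, next) ->
       acc := x :: !acc;
       cur := next);
    incr i
  done;
  if !ok then Some (List.rev !acc, !cur) else None
```

Both functions return the same result for the same input; only the stack usage differs. In a real combinator library each element usually costs several nested calls rather than one, which would make a limit somewhere between 100 and 200 repetitions plausible on a small JS stack.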

darrenldl Sep 21 '22 13:09

Okay, someone has presumably made the same discovery: https://github.com/inhabitedtype/angstrom/issues/221

darrenldl Sep 21 '22 13:09

Using the tail-recursive version mentioned in that issue doesn't seem to help, probably because it isn't compiled into a loop during JS compilation.

Looks like I'll have to hand-roll a parser for this.
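As a rough idea of what a hand-rolled, loop-based parser for a count-prefixed binary blob could look like, here is a minimal sketch. The byte layout, `uint16_be` and `parse_tables` are assumptions made up for illustration; this is neither timedesc's real format nor the fix that was eventually pushed. The point is only that a plain `for` loop reads the entries without nesting calls.

```ocaml
module M = Map.Make (String)

(* Read a big-endian unsigned 16-bit integer at [pos]. *)
let uint16_be (s : string) (pos : int) : int =
  (Char.code s.[pos] lsl 8) lor Char.code s.[pos + 1]

(* Hypothetical layout (NOT timedesc's real format):
   [count : uint16 BE], then [count] entries of
   [name_len : uint8] [name : name_len bytes] [value : uint16 BE]. *)
let parse_tables (s : string) : int M.t option =
  try
    let count = uint16_be s 0 in
    let pos = ref 2 in
    let acc = ref M.empty in
    (* A plain loop: stack depth does not depend on [count]. *)
    for _i = 1 to count do
      let name_len = Char.code s.[!pos] in
      let name = String.sub s (!pos + 1) name_len in
      let value = uint16_be s (!pos + 1 + name_len) in
      acc := M.add name value !acc;
      pos := !pos + 1 + name_len + 2
    done;
    Some !acc
  with Invalid_argument _ -> None (* truncated or malformed input *)
```

Out-of-bounds reads raise `Invalid_argument`, which the sketch turns into `None`; a real parser would likely want more precise error reporting.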

darrenldl Sep 21 '22 13:09

@zbaylin I pushed a fix, tried it with Node.js, and it seems to be working now. Can you try pinning to the main branch and see if the issue is gone (in both the test setup and the actual code)?

Thanks for investigating btw!

darrenldl Sep 21 '22 15:09