cil icon indicating copy to clipboard operation
cil copied to clipboard

More efficient `breakString`

Open michael-schwarz opened this issue 10 months ago • 6 comments

Closes #169.

See also https://github.com/goblint/analyzer/issues/1513 that also has some runtime comparisons.

michael-schwarz avatar Jan 14 '25 16:01 michael-schwarz

This was a bit of a pain because I did not consider that building the result from right to left means that the accumulator cannot be used to pass in a prefix, I'll squash merge then.

I am confident this is correct now, it both passes all printing tests in this repo, as well as all cram tests in the goblint repo which also exercise this function.

michael-schwarz avatar Jan 14 '25 20:01 michael-schwarz

This now traverses the string right-to-left (because the Stdlib splitting does so because that's needed to construct a list), but the old version went left-to-right. Also, it looks like the document tree from this will be completely different, no? Leaning in a different direction, Lines in different places, no CText (which I assume is an optimization to keep trees smaller).

It looks like Pretty significantly prefers some shape over another and has flatten to improve things. So we need to be quite careful with these details: otherwise we might make breakString faster but later outputting it inefficient. Even if the default flattenBeforePrint would take care of it, constructing the correct tree directly in the first place would be good.

sim642 avatar Jan 16 '25 09:01 sim642

Judging by the comments a few lines below, this right-heavy tree actually seems preferable over the one they had previously:

https://github.com/goblint/cil/blob/f5ee39bd344dc74e2a10e407d877e0ddf73c9c6f/src/ocamlutil/pretty.ml#L116-L119

michael-schwarz avatar Jan 16 '25 15:01 michael-schwarz

https://github.com/goblint/cil/blob/f5ee39bd344dc74e2a10e407d877e0ddf73c9c6f/src/ocamlutil/pretty.ml#L227-L231

It seems like the new function I give constructs more or less the tree you would get after calling flatten?

michael-schwarz avatar Jan 16 '25 15:01 michael-schwarz

It seems like the new function I give constructs more or less the tree you would get after calling flatten?

That's true, which maybe is a good thing. Although it does also have an interesting effect: since flatten is written to do tail-recursion on the left, it does non-tail-recursion on the right. So when flattening a right-leaning tree, it actually goes into the very same deep recursion that the whole flattening process is trying to avoid later when doing the printing.

So it's not clear to me what of the old behavior should be retained (for the sake of just optimizing) and what's actually inconsequential (given how breakString wasn't really thought through for efficiency). Maybe it's not worth preserving all the old behavior after all.

sim642 avatar Jan 17 '25 10:01 sim642

Given we did not encounter a stack overflow for https://github.com/goblint/analyzer/issues/1513 where I also used an non tail-recursive helper, I would assume that this is also not problematic here.

If we run into any trouble, we can always go back to a more faithful replacement.

michael-schwarz avatar Jan 17 '25 13:01 michael-schwarz