Compiler optimization impacts log generation
We discovered with @denismerigoux that the compiler optimization (-O flag) doesn't preserve the entire log trace.
I started having a look, and there are a few optimisation cases indeed where the log annotations are just dropped. It's possible to detect these cases, perform the optimisations and re-add the logging calls around the result.
However, the problem is deeper than that: the optimiser assumes that the language is purely functional when, for example, it performs beta-reductions. Logging calls being side-effects, they could get discarded or duplicated…
So fixing this might involve rethinking a bit how the optimiser works, or adding special, more careful handling of logging calls. Could beta-reduction actually be harmful in other cases (I am thinking of exceptions)?
Or maybe there are some invariants I am not yet aware of that ensure good properties for these transformations?
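To make the beta-reduction concern concrete, here is a minimal sketch on a toy expression language. Everything in it (the `expr` type, `eval`, `subst`, `beta`) is hypothetical and unrelated to the actual Catala AST or optimiser; it only illustrates how inlining a logged argument can duplicate or drop a log entry while leaving the computed value unchanged.

```ocaml
(* Toy language; all constructors are hypothetical, not the Catala AST. *)
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | Log of string * expr         (* record a message, then behave like the inner expr *)
  | Let of string * expr * expr  (* let x = e1 in e2, i.e. (fun x -> e2) e1 *)

(* Call-by-value evaluator returning the value and the log trace, in order. *)
let eval (e : expr) : int * string list =
  let trace = ref [] in
  let rec go env = function
    | Int n -> n
    | Var x -> List.assoc x env
    | Add (a, b) ->
      let va = go env a in
      let vb = go env b in
      va + vb
    | Log (msg, e) ->
      trace := msg :: !trace;
      go env e
    | Let (x, e1, e2) ->
      let v = go env e1 in
      go ((x, v) :: env) e2
  in
  let v = go [] e in
  (v, List.rev !trace)

(* Naive substitution; no capture avoidance, which is enough for closed arguments. *)
let rec subst x arg = function
  | Int n -> Int n
  | Var y -> if x = y then arg else Var y
  | Add (a, b) -> Add (subst x arg a, subst x arg b)
  | Log (m, e) -> Log (m, subst x arg e)
  | Let (y, e1, e2) ->
    Let (y, subst x arg e1, (if x = y then e2 else subst x arg e2))

(* Beta-reduce the outermost binding only, as a pure-language optimiser might. *)
let beta = function
  | Let (x, e1, e2) -> subst x e1 e2
  | e -> e

let logged = Log ("computing x", Int 21)
let twice = Let ("x", logged, Add (Var "x", Var "x"))  (* x is used twice *)
let unused = Let ("x", logged, Int 0)                   (* x is never used *)
```

In this sketch, `eval twice` logs once but `eval (beta twice)` logs twice, and `eval unused` logs once but `eval (beta unused)` logs nothing. The returned integers are identical in both pairs; only the trace differs.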
> I started having a look, and there are a few optimisation cases indeed where the log annotations are just dropped. It's possible to detect these cases, perform the optimisations and re-add the logging calls around the result.
Yep that would be a good solution.
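A rough sketch of what "detect, optimise, re-add the logging calls" could look like, again on a toy AST. The `expr` type and the helpers `strip_logs`, `rewrap` and `optimize` are assumptions made up for illustration; the real pass in `compiler/lcalc/optimizations.ml` works on a different representation.

```ocaml
(* Toy AST; hypothetical, not the Catala one. *)
type expr =
  | Bool of bool
  | Int of int
  | If of expr * expr * expr
  | Log of string * expr

(* Peel log annotations off an expression, outermost first. *)
let rec strip_logs = function
  | Log (m, e) ->
    let msgs, core = strip_logs e in
    (m :: msgs, core)
  | e -> ([], e)

(* Re-wrap [e] with [msgs], preserving the original nesting order. *)
let rewrap msgs e = List.fold_right (fun m acc -> Log (m, acc)) msgs e

(* Fold [if <literal> then a else b], looking through logs on the condition
   and putting them back around the branch that is kept. *)
let rec optimize = function
  | If (c, t, f) ->
    let c = optimize c and t = optimize t and f = optimize f in
    let msgs, core = strip_logs c in
    (match core with
     | Bool true -> rewrap msgs t
     | Bool false -> rewrap msgs f
     | _ -> If (c, t, f))
  | Log (m, e) -> Log (m, optimize e)
  | e -> e
```

For example, `optimize (If (Log ("cond", Bool true), Log ("then", Int 1), Int 2))` yields `Log ("cond", Log ("then", Int 1))`: the conditional is gone, but the "cond" entry still precedes the "then" entry, as it would have at run time before the optimisation.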
> However, the problem is deeper than that: the optimiser assumes that the language is purely functional when, for example, it performs beta-reductions. Logging calls being side-effects, they could get discarded or duplicated…
We don't do any beta-reduction in the Catala compiler as of now. A while back, @lIlIlIlIIIIlIIIllIIlIllIIllIII implemented
https://github.com/CatalaLang/catala/blob/1f4e869c33dfc1a8f0b7cf892b7a47d340429f30/compiler/lcalc/optimizations.ml#L107-L113
But as the comment says, it is not used anywhere. The logging calls do perform side effects, but I didn't include them in the semantics of the language. Semantics-wise, the logging calls are equivalent to the identity function; their side effects are part of the TCB (trusted computing base).
I don't think we need to rethink the semantics of the language to include side-effects so that it is neatly reflected in the compiler. That would add a lot of complexity and force us to leave the nice purity of the language. The way I see these logging calls is more like the Pos.t in the AST: when performing optimizations, you have to figure out how to propagate them in the correct way to have error messages that make sense, but it will never affect the returned value of the code.
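The "identity function" view can be stated as a small property: erasing every logging call never changes the returned value, only the trace. Below is a minimal sketch under that assumption, with a hypothetical toy `expr` type and hypothetical `erase` and `value` functions rather than anything from the Catala code base.

```ocaml
(* Toy AST; hypothetical, not the Catala one. *)
type expr =
  | Int of int
  | Add of expr * expr
  | Log of string * expr

(* Messages recorded by [value]; stands in for the runtime's logging side effect. *)
let trace : string list ref = ref []

(* Remove every logging call, treating Log as the identity. *)
let rec erase = function
  | Int n -> Int n
  | Add (a, b) -> Add (erase a, erase b)
  | Log (_, e) -> erase e

(* Evaluate an expression, recording log messages as a side effect. *)
let rec value = function
  | Int n -> n
  | Add (a, b) ->
    let va = value a in
    let vb = value b in
    va + vb
  | Log (msg, e) ->
    trace := msg :: !trace;
    value e

let example = Add (Log ("left", Int 1), Log ("right", Add (Int 2, Int 3)))
```

Here `value example` and `value (erase example)` return the same integer; an optimisation only needs to agree with the original program up to `erase` to be semantically correct, while keeping the trace sensible is a separate propagation concern, much like positions.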