convex
convex copied to clipboard
Static compilation of core symbols is error-prone
(Follow up after exposing the issue on Discord)
Enabling the static compilation of core symbols has resulted in many weird failures in my CVX code. This is because it essentially led to situations like this:
(def a (deploy '(do (def set 42) (defn get-set [] set))))
(a/get-set)
;; Returns core `set` instead of 42
What I am certain of is that core symbols should not be static because such situations are very hard to figure out unless you know exactly what is going on. If it happened to me to such an extent, it will happen to anyone and be a problem. I didn't even get to resolve everything, I just tried against a build with static compilation disabled and it solved all my problems. At best an inconvenience, at worse vulnerabilities in smart contracts.
Furthermore, I am now reconsidering the whole idea of :static. In Clojure, direct linking is only used as an optimization for uberjars where the code is set in stone and not expected to change. Many devs don't even bother turning it on. On a smart contract platform that is so dynamic, it now feels like a bad idea doing something similar.
There is a more general issue to think about here which is that of compilation units. The reason we see this is that:
- The whole quoted form gets compiled before it is executed, hence
setis still defined in core at that point - Static replacement happens in the compilation phase
- Definition happens at the execution phase (too late for the redefinition of
setin this case)
People have run into a similar issue defining macros in the same code body where they are used.
We should think about possible general solutions. I see some interesting ideas:
- Delaying compilation of certain code so that definition happen first
- Placeholder definitions (would overwrite metadata and prevent things like
:staticbeing an issue - Breaking up compilation units
- Being more strict about when / how definitions can occur
I'm really not sure it is statically possible to partly delay compilation in a way that would solve that kind of problems. Factoring in that compilation can happen both on and off chain, I probably wouldn't attempt making the compiler any more complex.
Besides the well known issues about macros, we were in a good place: what you see is what you get. Issues with macros are not that big a deal: they are an expert feature, empirically they are seldom used (e.g. Clojure), and they are literally expensive because they de-facto require on-chain compilation. I foresee expert users will rather use off-chain macros while compiling transaction code but at that point, we are talking about people who clearly know what they do.
On the other hand, any counter-intuitive behavior with such a core building block as definition will lead to chaos and despair. Unless a benchmark shows such an impressive perf gain that we can't ignore it, I would probably just disable static compilation (aka direct linking) and not overthink it. At least for the time being and if we get more resources pre-launch, we can always reopen the case. WDYT?
I did benchmark the effect of static compilation using Criterium. 2 pre-compiled pieces of code.
"Simple":
(conj [] (inc (* 2 42)))
"Complex":
(do
(defn math [x y]
(* x
(+ y x)))
(loop [acc []
i 10]
(if (zero? i)
(first acc)
(recur (conj acc
(math i i))
(dec i)))))
Results:
SIMPLE NO STATIC
Evaluation count : 194082180 in 60 samples of 3234703 calls.
Execution time mean : 308,320613 ns
Execution time std-deviation : 0,711513 ns
Execution time lower quantile : 307,341778 ns ( 2,5%)
Execution time upper quantile : 309,307284 ns (97,5%)
Overhead used : 1,825324 ns
SIMPLE STATIC
Evaluation count : 613894380 in 60 samples of 10231573 calls.
Execution time mean : 102,027353 ns
Execution time std-deviation : 6,832655 ns
Execution time lower quantile : 95,686450 ns ( 2,5%)
Execution time upper quantile : 112,184479 ns (97,5%)
Overhead used : 1,829865 ns
COMPLEX NO STATIC
Evaluation count : 5341500 in 60 samples of 89025 calls.
Execution time mean : 11,245868 µs
Execution time std-deviation : 52,553503 ns
Execution time lower quantile : 11,188530 µs ( 2,5%)
Execution time upper quantile : 11,391687 µs (97,5%)
Overhead used : 1,825324 ns
COMPLEX STATIC
Evaluation count : 9249000 in 60 samples of 154150 calls.
Execution time mean : 6,441758 µs
Execution time std-deviation : 12,685712 ns
Execution time lower quantile : 6,420361 µs ( 2,5%)
Execution time upper quantile : 6,458891 µs (97,5%)
Overhead used : 1,829865 ns
I didn't go much further but the effect of static compilation looks very significant at first. It might double the CVM throughput on average. Although it might be less true for real contract code. However, we certainly don't have an execution bottleneck at the moment and working on this is not a quick win.
I put this in my list of pre-launch optimizations.
OK based on the performance impact I think we want to keep static inlining of core functions switched on by default. Main question is how to mitigate people shooting themselves in the foot.
Wait, isn't it the opposite of what we were saying today? Leave it off by default for now and we'll try to optimize for that pre-launch?
Must have got our wires crossed - I thought we said to keep the static compilation for now :-) ?
I think we should stick with static compilation because
- It's clearly a lot faster so we will likely want to use it in some form
- it will flush out problems / risks in user code which might inform good mitigations / solutions prior to launch.
Agreed main issue to solve is to help people not shoot themselves in foot here. We definitely want static comp.