Gabriel Scherer
I wonder if code-alignment effects could play a role here. Do you observe the same results if you swap the order of the functions in your source file? Note: unless...
@dustanddreams so you observe a performance difference between 5.2 and trunk? There is a large absolute difference between your 4.14 and trunk numbers (~30ms on 4.14, ~90ms on trunk), is...
On my (Linux, amd64) system there is no such difference between 4.14 and 5.1.
```
Summary
  5.1 unsafe, unrolled ran
    1.06 ± 0.03 times faster than 4.14 unsafe, unrolled
    1.25...
```
I think that "a constant number of time" and "never" are similar for performance analysis. For example I wondered if it could be the case that some active security system...
I wonder why there is such a performance cliff between the unrolled and non-unrolled version. Would you maybe play with unrolling 2,3,4,5 times to see what happens? Naively one would...
@dustanddreams you reproduced this issue on a non-Mac arm64 system. Are you still using Apple-produced arm64 hardware, or an Arm CPU from some other vendor?
Note: @maranget (private conversation) grumbled that the fence we are talking about may not actually be necessary on Apple-provided arm64 machines, or in fact most arm64 machines, which implement a...
I am worried that existing packagers of OCaml for Linux distributions could rely on the current interface to pass hardening flags, so for them the change would be a regression....
> The current PR does not fulfill them in its present form, and I think your needs can be addressed in a separate PR because they are distinct. Maybe this...
I believe that this is a step in the right direction -- hardcoding a fixed maximum number of domains is bound to create issues sooner or later. I tried to use...