Igor van den Hoven
When it comes to performance testing, I always uncomment this line in bench.c:

```
//#define cmp(a,b) (*(a) > *(b)) // uncomment for fast primitive comparisons
```

That allows a fair...
I took a closer look at this. As far as I can tell, overall branchless swap performance is worse for gcc and clang on my hardware. Ideally, you get that...
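For context, here is a minimal C sketch of what a branchless two-element swap can look like next to a branched one. The function names are hypothetical and not taken from any of the projects discussed. Whether the compiler actually emits branch-free machine code for the second version depends on the compiler and its flags, so this only illustrates the source-level idea.

```c
/* Hypothetical helpers for illustration only. */

/* Branched: the swap is guarded by an if statement. */
static void swap_branched(int *a)
{
    if (a[0] > a[1]) {
        int tmp = a[0];
        a[0] = a[1];
        a[1] = tmp;
    }
}

/* Branchless in source form: the comparison result is used as an index,
   so both stores always execute and no if statement is needed. */
static void swap_branchless(int *a)
{
    int x = a[0] > a[1];   /* 1 if the pair is out of order, else 0 */
    int y = !x;
    int tmp = a[y];        /* the larger (or equal) element */
    a[0] = a[x];           /* the smaller (or equal) element */
    a[1] = tmp;
}
```

Both functions leave the pair in ascending order. The second one trades a possibly mispredicted branch for unconditional loads and stores, which may or may not pay off on a given compiler and CPU.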
@Voultapher https://github.com/Voultapher/sort-research-rs/blob/main/writeup/glidesort_perf_analysis/text.md Just saw your benchmark. I've recently released a fluxsort and quadsort update with compile-time optimizations for clang. Overall, quadsort should be the fastest sort for random when compiled...
On my own system, monobound still outperforms on powers of 2. When compiled with clang it slows down dramatically, as does monobound. This had made me wonder if either are...
WSL 2 gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) on an i3-8100. I haven't looked at the assembly, but the benchmark shows that clang runs the code as branched. Using `-mllvm -x86-cmov-converter=false`...
P.S. Also keep in mind that once you get down to the microsecond level, seemingly irrelevant things start playing a role. You can make an irrelevant code change, and...
From a compatibility / performance perspective, the main importance is that software that works with `a > b` (unstable if true) still works with `a - b`, even if the...
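A minimal C sketch of that compatibility point, assuming a sort that only moves elements when the comparator returns a value greater than zero. All names here are hypothetical. The three-way comparator is written as a difference of comparisons instead of a literal `a - b` to avoid signed integer overflow; for this purpose it behaves the same way.

```c
#include <stddef.h>

typedef int cmp_fn(const int *a, const int *b);

/* Boolean comparator: 1 when a must come after b, otherwise 0. */
static int cmp_greater(const int *a, const int *b)
{
    return *a > *b;
}

/* Three-way comparator: negative, zero, or positive. */
static int cmp_three_way(const int *a, const int *b)
{
    return (*a > *b) - (*a < *b);
}

/* Insertion sort that shifts an element only when cmp(...) > 0,
   so it treats 0 and negative results identically. */
static void insertion_sort(int *v, size_t n, cmp_fn *cmp)
{
    for (size_t i = 1; i < n; i++) {
        int key = v[i];
        size_t j = i;

        while (j > 0 && cmp(&v[j - 1], &key) > 0) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}
```

Because the sort only tests for `> 0`, passing either comparator produces the same ordering. A sort that instead tested `cmp(...) != 0` or `cmp(...) < 0` would behave differently depending on which comparator style it was given.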
Looks like the problem is in telopt.c with the code starting with: ``` if (HAS_BIT(d->mth->comm_flags, COMM_FLAG_REMOTEECHO)) ``` Try replacing that code block with: ``` if (HAS_BIT(d->mth->comm_flags, COMM_FLAG_REMOTEECHO)) { pto =...
Marshall Lochbaum created a derived sort a while back that utilizes counting sort. https://github.com/mlochbaum/distcrum
It's explained in the help file: ``` [ ] . + | ( ) ? * are treated as normal text unless used within braces. Keep in mind that {...