Tuning suite
Remove the n_mod part from #1991, and just focus on the tuning suite.
To keep this concrete, the goal of this PR is to lay the foundation for a somewhat modular tuning suite and to include tuners for
- [x] `n_xgcd` and `n_gcdinv`,
- [ ] `flint_mpn_mulhigh_n` and `flint_mpn_sqrhigh`.
Btw, @fredrik-johansson do you know why the current table for k-values for Mulders' high multiplication never favors _flint_mpn_mulhigh_basecase? It feels like it should favor the basecase at least for n < 20.
> Btw, @fredrik-johansson do you know why the current table for k-values for Mulders' high multiplication never favors _flint_mpn_mulhigh_basecase? It feels like it should favor the basecase at least for n < 20.
But it does for n = 10, 11, 12 :-)
It's not that surprising to me: for small n Mulders ends up doing three hardcoded multiplications which are extremely fast and for slightly larger n Karatsuba kicks in.
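To make the k-table question concrete, here is a minimal sketch of how such a table could be derived from Mulders' recurrence, where a high product of size n costs one full k x k product plus two high products of size n - k. Everything in it is an assumption for illustration: the cost functions are a toy model (not FLINT timings), the Karatsuba cutoff of 16 is made up, the names are hypothetical, and using `best_k[n] == 0` to mean "basecase" is a convention of this sketch only.

```c
#define SKETCH_MAX_N 64

/* Toy cost of a full k x k product: schoolbook below a made-up cutoff,
   three half-size products plus linear overhead above it. */
static double full_mul_cost_sketch(int k)
{
    if (k < 16) /* made-up cutoff, not a tuned value */
        return (double) k * k;
    return 3.0 * full_mul_cost_sketch((k + 1) / 2) + 4.0 * k;
}

/* Fills best_k[n] and cost[n] for 0 <= n <= SKETCH_MAX_N.
   best_k[n] == 0 means the basecase wins under this model; otherwise
   best_k[n] is the size k of the full product in Mulders' split. */
void tune_mulhigh_sketch(int best_k[SKETCH_MAX_N + 1],
                         double cost[SKETCH_MAX_N + 1])
{
    cost[0] = 0.0;
    best_k[0] = 0;

    for (int n = 1; n <= SKETCH_MAX_N; n++)
    {
        /* A basecase high product touches about half of the n x n
           limb products. */
        cost[n] = 0.5 * n * (n + 1);
        best_k[n] = 0;

        /* Mulders: one full k x k product and two high products of
           size n - k, with n/2 <= k < n. */
        for (int k = (n + 1) / 2; k < n; k++)
        {
            double c = full_mul_cost_sketch(k) + 2.0 * cost[n - k];
            if (c < cost[n])
            {
                cost[n] = c;
                best_k[n] = k;
            }
        }
    }
}
```

With a purely schoolbook cost the split never strictly beats the basecase in this model, so any real preference at small n has to come from effects the model ignores, such as hardcoded small multiplications. A real tuner would replace the cost functions with measured timings.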
- Why not implement both `n_xgcd_euclidean` and `n_xgcd_binary` so that one can experiment with both without invoking the build system (how we usually do it)?
- Why not use the existing `flint/src/module/tune` directories? If the directory structure is to be changed, I think I'd rather use `flint/tune` than `flint/src/tune`.
> - Why not implement both `n_xgcd_euclidean` and `n_xgcd_binary` so that one can experiment with both without invoking the build system (how we usually do it)?
I'm thinking that the binary version is sort of useless if the CPU has fast division. I don't really like the idea of having different versions of these sorts of functions.
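For reference, the trade-off can be sketched as follows: the Euclidean variant performs one hardware division per step, while the binary variant uses only shifts and subtractions but takes more iterations. This is a minimal standalone sketch, not FLINT code. All names are hypothetical, it computes only the gcd (not the cofactors that `n_xgcd` returns), and the timing helper is a crude `clock()` loop rather than a proper profiler.

```c
#include <stdint.h>
#include <time.h>

/* Euclidean gcd: one division (remainder) per iteration. */
uint64_t gcd_euclid_sketch(uint64_t a, uint64_t b)
{
    while (b != 0)
    {
        uint64_t r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Binary (Stein) gcd: only shifts, comparisons and subtractions. */
uint64_t gcd_binary_sketch(uint64_t a, uint64_t b)
{
    int shift = 0;

    if (a == 0) return b;
    if (b == 0) return a;

    /* Factor out the common power of two. */
    while (((a | b) & 1) == 0)
    {
        a >>= 1;
        b >>= 1;
        shift++;
    }
    while ((a & 1) == 0)
        a >>= 1;

    /* Invariant: a is odd. */
    while (b != 0)
    {
        while ((b & 1) == 0)
            b >>= 1;
        if (a > b)
        {
            uint64_t tmp = a;
            a = b;
            b = tmp;
        }
        b -= a;
    }
    return a << shift;
}

typedef uint64_t (*gcd_fn_sketch)(uint64_t, uint64_t);

/* Times reps calls of f on pseudo-random inputs; returns CPU seconds. */
double time_gcd_sketch(gcd_fn_sketch f, int reps)
{
    volatile uint64_t sink = 0; /* keeps the calls from being optimized out */
    uint64_t x = UINT64_C(0x9E3779B97F4A7C15);
    clock_t start = clock();

    for (int i = 0; i < reps; i++)
    {
        /* xorshift64 step to generate the next input */
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        sink += f(x, (x >> 1) | 1);
    }
    return (double) (clock() - start) / CLOCKS_PER_SEC;
}
```

Comparing `time_gcd_sketch(gcd_euclid_sketch, reps)` against `time_gcd_sketch(gcd_binary_sketch, reps)` on a given machine gives a rough indication of which variant that CPU favors.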
> - Why not use the existing `flint/src/module/tune` directories? If the directory structure is to be changed, I think I'd rather use `flint/tune` than `flint/src/tune`.
I think it is nice if the tuning code is gathered together in one place. And I would argue that the tuning is part of the source code, but I'm okay with `flint/tune`.