flint Tuning suite

Remove the n_mod part from #1991, and just focus on the tuning suite.

To keep this concrete, the goal of this PR is to lay the foundation for a somewhat modular tuning suite and to include tuners for

- [x] n_xgcd and n_gcdinv,
- [ ] flint_mpn_mulhigh_n and flint_mpn_sqrhigh.

Btw, @fredrik-johansson, do you know why the current table of k-values for Mulders' high multiplication never favors _flint_mpn_mulhigh_basecase? It feels like it should favor the basecase at least for n < 20.

May 27 '24 23:05 albinahlback

> Btw, @fredrik-johansson, do you know why the current table of k-values for Mulders' high multiplication never favors _flint_mpn_mulhigh_basecase? It feels like it should favor the basecase at least for n < 20.

But it does for n = 10, 11, 12 :-)

It's not that surprising to me: for small n, Mulders ends up doing three hardcoded multiplications, which are extremely fast, and for slightly larger n, Karatsuba kicks in.
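For readers who want to see where the three multiplications come from, here is a minimal sketch of a Mulders-style high product on 32-bit limbs. It is not FLINT's code. The names mul_full, add_into, choose_k, mulhigh and mulhigh_error are hypothetical, the basecase is a plain schoolbook product rather than a hardcoded kernel, and k = 3n/4 is an arbitrary stand-in for the tuned k-table (with 0 assumed to mean "use the basecase"). For a cut point k with n/2 <= k < n, the high part is assembled from one full k-limb product and two recursive high products of n - k limbs each.

```c
#include <stdint.h>
#include <string.h>

#define MAXN 64  /* largest limb count supported by this sketch */

/* Exact schoolbook product: r[0..2n-1] = a[0..n-1] * b[0..n-1]. */
static void mul_full(uint32_t *r, const uint32_t *a, const uint32_t *b, int n)
{
    memset(r, 0, 2 * (size_t) n * sizeof(uint32_t));
    for (int i = 0; i < n; i++) {
        uint64_t carry = 0;
        for (int j = 0; j < n; j++) {
            uint64_t t = (uint64_t) a[i] * b[j] + r[i + j] + carry;
            r[i + j] = (uint32_t) t;
            carry = t >> 32;
        }
        r[i + n] = (uint32_t) carry;
    }
}

/* r[0..n-1] += s[0..m-1] with m <= n; a carry out of the top limb is dropped. */
static void add_into(uint32_t *r, int n, const uint32_t *s, int m)
{
    uint64_t carry = 0;
    for (int i = 0; i < n; i++) {
        uint64_t t = (uint64_t) r[i] + (i < m ? s[i] : 0) + carry;
        r[i] = (uint32_t) t;
        carry = t >> 32;
    }
}

/* Hypothetical stand-in for a tuned table: 0 selects the basecase. */
static int choose_k(int n)
{
    return (n < 4) ? 0 : 3 * n / 4;
}

/* Approximate high product: r[0..n-1] is close to floor(a*b / 2^(32n))
   and never larger than it. */
static void mulhigh(uint32_t *r, const uint32_t *a, const uint32_t *b, int n)
{
    uint32_t full[2 * MAXN], t[MAXN];
    int k = choose_k(n);

    if (k == 0) {
        /* Basecase stand-in: exact product, keep the top n limbs. */
        mul_full(full, a, b, n);
        memcpy(r, full + n, (size_t) n * sizeof(uint32_t));
        return;
    }

    int l = n - k;  /* 1 <= l <= k */

    /* 1) Full product of the top k limbs; its top n limbs seed the result. */
    mul_full(full, a + l, b + l, k);
    memcpy(r, full + (2 * k - n), (size_t) n * sizeof(uint32_t));

    /* 2) High product of the top l limbs of a with the low l limbs of b. */
    mulhigh(t, a + k, b, l);
    add_into(r, n, t, l);

    /* 3) High product of the low l limbs of a with the top l limbs of b. */
    mulhigh(t, a, b + k, l);
    add_into(r, n, t, l);
}

/* Exact high part minus approximate high part, or -1 if the difference
   does not fit in a single limb. */
static int64_t mulhigh_error(const uint32_t *a, const uint32_t *b, int n)
{
    uint32_t full[2 * MAXN], approx[MAXN];
    uint32_t low = 0;
    int64_t borrow = 0;

    mul_full(full, a, b, n);
    mulhigh(approx, a, b, n);
    for (int i = 0; i < n; i++) {
        int64_t d = (int64_t) full[n + i] - approx[i] - borrow;
        borrow = (d < 0);
        if (i == 0) low = (uint32_t) d;
        else if ((uint32_t) d != 0) return -1;
    }
    return borrow ? -1 : (int64_t) low;
}
```

With this split, every sub-multiplication for small n lands on a size that a real implementation could serve with a fixed-size kernel, which is presumably why the table rarely picks the basecase. A tuner would replace choose_k with a table filled in by timing each admissible k for each n, and mulhigh_error gives a quick check that the result stays within a few units of the exact high part.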

May 28 '24 07:05 fredrik-johansson

Why not implement both n_xgcd_euclidean and n_xgcd_binary so that one can experiment with both without invoking the build system (how we usually do it)?
Why not use the existing flint/src/module/tune directories? If the directory structure is to be changed, I think I'd rather use flint/tune than flint/src/tune.

Jun 03 '24 07:06 fredrik-johansson

> Why not implement both n_xgcd_euclidean and n_xgcd_binary so that one can experiment with both without invoking the build system (how we usually do it)?

I'm thinking that the binary version is more or less useless if the CPU has fast division. I don't really like the idea of having different versions of this sort of function.
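To illustrate the trade-off, here is a minimal sketch of the two strategies for a plain GCD on 64-bit words. It is deliberately simpler than n_xgcd (no cofactors), and the names gcd_euclidean and gcd_binary are hypothetical, not FLINT functions. The Euclidean loop costs one hardware division per step, while the binary loop uses only shifts, comparisons and subtractions but typically needs more iterations.

```c
#include <stdint.h>

/* Euclidean GCD: one hardware division (remainder) per iteration. */
static uint64_t gcd_euclidean(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Binary (Stein) GCD: shifts, comparisons and subtractions only. */
static uint64_t gcd_binary(uint64_t a, uint64_t b)
{
    int shift = 0;

    if (a == 0) return b;
    if (b == 0) return a;

    /* Factor out the powers of two common to a and b. */
    while (((a | b) & 1) == 0) {
        a >>= 1;
        b >>= 1;
        shift++;
    }

    /* Make a odd; it stays odd for the rest of the loop. */
    while ((a & 1) == 0)
        a >>= 1;

    while (b != 0) {
        while ((b & 1) == 0)
            b >>= 1;
        if (a > b) {
            uint64_t t = a;
            a = b;
            b = t;
        }
        b -= a;  /* odd minus odd is even, so b shrinks on the next pass */
    }

    return a << shift;
}
```

Which variant wins depends on the division latency of the target CPU, which is the kind of question a tuner could settle by timing both on representative inputs.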

> Why not use the existing flint/src/module/tune directories? If the directory structure is to be changed, I think I'd rather use flint/tune than flint/src/tune.

I think it is nice if the tuning code is gathered in one place. I would also argue that the tuning is part of the source code, but I'm okay with flint/tune.

Jun 04 '24 14:06 albinahlback