Try to optimize a couple of things
This PR seems to be faster when ADX is available, but I had trouble getting consistent results, so I don't know whether the changes to fmpz_mul and fmpz_sqr are actually faster.
There may be further things to optimize, and if so, I would be happy to do that.
Perhaps X_mat_neg could have an X_mat_inplace_neg counterpart instead.
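To make the X_mat_inplace_neg idea concrete, here is a minimal sketch of the two call shapes. It assumes a hypothetical toy_mat type with plain long entries. None of these names or definitions come from FLINT, and the real matrix types would need their own entry-wise negation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy matrix type, for illustration only (not a FLINT type). */
typedef struct {
    long *entries;   /* row-major storage, rows * cols entries */
    size_t rows;
    size_t cols;
} toy_mat;

/* Two-operand form: res = -src.
   Both matrices must have the same dimensions. */
void toy_mat_neg(toy_mat *res, const toy_mat *src)
{
    assert(res->rows == src->rows && res->cols == src->cols);
    for (size_t i = 0; i < src->rows * src->cols; i++)
        res->entries[i] = -src->entries[i];
}

/* In-place form: mat = -mat.
   Only one operand, so there is no destination to set up or check. */
void toy_mat_inplace_neg(toy_mat *mat)
{
    for (size_t i = 0; i < mat->rows * mat->cols; i++)
        mat->entries[i] = -mat->entries[i];
}
```

With the in-place variant, the caller writes toy_mat_inplace_neg(&A) rather than toy_mat_neg(&A, &A), which makes the intent explicit and leaves no question about whether aliasing is supported.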
NOTE: I think the assertion worker will fail because fmpz_mul and fmpz_sqr allow aliasing under certain circumstances when ADX assembly is enabled.
Hmm, I think I have to split this into smaller PRs.
Yes.
The inplace functions look nice though. Is foo_neg_inplace a better name than foo_inplace_neg?
foo_method_inplace is nice because you can search for foo_method, and this will pop up. However, foo_inplace_method is somewhat more future-proof in case foo_inplace becomes its own module. It is also nice to search for foo_inplace and see which inplace methods are available. I don't have a strong opinion here, so I'll let you decide.
foo_method_inplace is nice because you can search for foo_method, and this will pop up.
That is my thinking.
However, foo_inplace_method is somewhat more future-proof in case foo_inplace becomes its own module.
It sounds extremely specialized to have whole modules just for inplace versions of functions.
It is also nice to search for foo_inplace and see which inplace methods are available.
I think you are more likely to search for versions of a specific operation than for a group of inplace operations.