quda
zMobius implementation
We have been experimenting with complex b[s] and c[s], which appears to give a similarly good approximation to Shamir as Mobius with only real coefficients. Supporting zMobius would mean either changing b_5[] and c_5[] to complex, or adding similar arrays for the imaginary parts.
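To make the two storage options concrete, here is a minimal sketch, not QUDA's actual parameter layout. Only b_5 and c_5 come from the discussion above; the struct names, the `b_5_im` / `c_5_im` arrays, and the helper functions are hypothetical and exist only for illustration. It shows that the two layouts carry the same information and that real Mobius is simply the case where every imaginary part is zero.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Option A: the coefficient arrays themselves become complex.
struct CoeffsComplex {
  std::vector<Complex> b_5;
  std::vector<Complex> c_5;
};

// Option B: keep the real arrays and add parallel arrays for the
// imaginary parts (b_5_im and c_5_im are hypothetical names).
struct CoeffsSplit {
  std::vector<double> b_5, c_5;        // real parts
  std::vector<double> b_5_im, c_5_im;  // imaginary parts
};

// Convert the split layout into the complex layout.
CoeffsComplex toComplex(const CoeffsSplit &s) {
  assert(s.b_5.size() == s.b_5_im.size());
  assert(s.c_5.size() == s.c_5_im.size());
  CoeffsComplex out;
  for (std::size_t i = 0; i < s.b_5.size(); ++i)
    out.b_5.emplace_back(s.b_5[i], s.b_5_im[i]);
  for (std::size_t i = 0; i < s.c_5.size(); ++i)
    out.c_5.emplace_back(s.c_5[i], s.c_5_im[i]);
  return out;
}

// Real Mobius is the special case with all imaginary parts equal to zero.
bool isRealMobius(const CoeffsComplex &c) {
  for (const Complex &z : c.b_5)
    if (z.imag() != 0.0) return false;
  for (const Complex &z : c.c_5)
    if (z.imag() != 0.0) return false;
  return true;
}
```

With either layout, a single code path can serve both Mobius and zMobius; the choice mainly affects how the coefficients are passed in through the interface.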
Ok, then this change is fairly trivial and I could likely do it in an hour or so. A couple of questions:
- I presume that this will have zero, or close to zero, effect on performance. If the effect is negligible, then we could just maintain a single zMobius implementation to avoid code bloat and a larger library footprint. Is this what is done on other architectures?
- When doing zMobius, would you want flops measured taking into account the complex multiplication?
Sorry for closing the issue accidentally. Re-opening now. In CPS and BFM, zMobius is superficially different from Mobius, Shamir, etc., because b[s] and c[s] are calculated internally rather than defined directly, but keeping a single implementation is certainly fine.
As for the flop count, there is some uncertainty, since some of the imaginary parts are often zero, but counting the flops for the imaginary parts is probably more accurate if there is a single implementation. Having said that, I don't feel strongly either way.
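The two counting conventions can be compared with a small sketch. It assumes the coefficient enters as an axpy-style update y += a * x on complex numbers, which costs 4 flops per complex element for real a (2 multiplies, 2 adds) and 8 flops for complex a (4 multiplies and 2 adds for the product, 2 adds to accumulate). The function names and the `complexPerSlice` parameter are hypothetical, and this is not how QUDA actually reports flops.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Flops for y += a * x on a single complex number.
constexpr long long kFlopsRealAxpy = 4;     // real a: 2 mul + 2 add
constexpr long long kFlopsComplexAxpy = 8;  // complex a: 4 mul + 4 add

// Convention 1: charge every coefficient as complex, whether or not
// its imaginary part happens to be zero.
long long flopsUniform(const std::vector<Complex> &coeff,
                       long long complexPerSlice) {
  return static_cast<long long>(coeff.size()) * complexPerSlice *
         kFlopsComplexAxpy;
}

// Convention 2: charge the complex cost only where the imaginary part
// is nonzero (meaningful only if the kernel really skips that work).
long long flopsSkippingZeros(const std::vector<Complex> &coeff,
                             long long complexPerSlice) {
  long long total = 0;
  for (const Complex &a : coeff)
    total += complexPerSlice *
             (a.imag() == 0.0 ? kFlopsRealAxpy : kFlopsComplexAxpy);
  return total;
}
```

If a single implementation always performs the complex multiply, the first convention matches the arithmetic actually executed; the second undercounts whenever zero imaginary parts are still multiplied through.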
@maddyscientist Status unchanged on this?