sincos
sincos copied to clipboard
Have you considered using CORDIC
Hi @jamesbowman nice little library. Good algorithm when you are running on CPU / FPGA with fast HW multiplier. As for Verilog implementation, have you considered using CORDIC algorithm. To my experience it requires way less gates and more pipeline friendly.
hmm: https://github.com/cebarnes/cordic
@jamesbowman yep, something like this.
@jamesbowman what is the target application for this library?
Good summary:
http://www.andraka.com/files/crdcsrvy.pdf
@jamesbowman I cant access the link. for some reason. Is it correct?
(Looking at this because my verilog sincos is suddenly a timing critical path.)
A straight 1-cycle implementation of 16-bit CORDIC must do 15 additions, inspecting the MSB of each addition. I don't think this will meet timing! Apart from inserting pipelining, is there any way to shorten this critical path?

@jamesbowman this a naïve way of computing CORDIC, and it is impractical.
Yes, you would need some pipelining to get 16-bit resolution.
To get a good and fast CORDIC implementation you would probably have to answer few questions.
Do you want 'generic' CORDIC? Meaning: All inputs (x0, y0, z0) are non-zero, and All outputs (xn, yn, zn) are in use, and need to be accurate.
With 'generic' CORDIC you can do (polar -> rectangular) or (rectangular -> polar) or mix.
Real savings can be achieved when you need only (rectangular -> polar) only. When you compute atan(x/y) for example. Is it your case?
The module is only computing sin(x), in Verilog:
module isin(
input signed [15:0] x,
output signed [15:0] s);
so that is (polar -> rectangular) with r=1. So x0 is a constant, y0 is zero, z0 is the angle Only the sine output (yn) is required.
Only 12 bits of result (that is, sign + 11 bits) are really needed.
in pure (rectangular -> polar) mode you can start with one of MSB lookup tables:
2-bit. gives you a quadrant. 3-bit. need to store 2-value (0.707, 1) 4-bit. 4-value table. 5-bit. 8-value table. etc.
You need to pre-scale values in the table depending of how many stages you will do after.
For the second stage you can lookup next group of bits (2-5) to decide on batch sequence of rotations that you want to perform. Batch rotations don't require intermediate sign values, so they can be done in Sign-Digit (SD) form https://en.wikipedia.org/wiki/Signed-digit_representation
@jamesbowman just to remind that single CORDIC step yields less the 1 bit of precision.
Good paper to read: rodrigues2010.pdf