universal
investigate if we can decouple an expression template front-end from an arithmetic back-end
boost::multiprecision is organized into a number class front-end and an arithmetic back-end:
https://www.boost.org/doc/libs/1_75_0/libs/multiprecision/doc/html/boost_multiprecision/perf/overhead.html
The number class provides expression template functionality but adds a little overhead: about 0.5% for simple types. So a 100 MOPS code would run at about 99.5 MOPS with expression templates enabled.
We know that for small types, the extra processing inside the expression templates to avoid temporaries is not going to be a win, but for arbitrary precision types that could need thousands of bytes, any copies avoided will be a big win.
Universal is mostly about providing high-performance, tailored number systems that will have a hardware data path executing them, so expression templates are not all that attractive there. However, as we are planning to incorporate arbitrary precision number systems, a pluggable expression template front-end would be attractive.
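To make the front-end/back-end split concrete, here is a minimal sketch, not Boost's or Universal's actual design. All names (`counting_backend`, `eager_number`, `lazy_number`, `add_expr`, `copies_for_sum`) are hypothetical and introduced only for illustration. The back-end knows arithmetic only and counts its copy constructions. Two interchangeable front-ends sit on top of it: one evaluates eagerly, the other builds an expression and evaluates it directly into the destination.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical back-end: arithmetic only. It counts copy constructions so the
// temporaries created by a front-end become visible. Moves and assignments
// into an existing object are deliberately not counted.
struct counting_backend {
    static inline int copies = 0;  // requires C++17
    long long value = 0;
    counting_backend() = default;
    explicit counting_backend(long long v) : value(v) {}
    counting_backend(const counting_backend& o) : value(o.value) { ++copies; }
    counting_backend(counting_backend&& o) noexcept : value(o.value) {}
    counting_backend& operator=(const counting_backend& o) { value = o.value; return *this; }
    void add_assign(const counting_backend& o) { value += o.value; }
};

// Front-end 1: eager evaluation. Every operator+ copies its left operand.
template <class Backend>
class eager_number {
public:
    explicit eager_number(long long v) : b_(v) {}
    const Backend& backend() const { return b_; }
    friend eager_number operator+(const eager_number& l, const eager_number& r) {
        eager_number t(l);          // one back-end copy per operator
        t.b_.add_assign(r.b_);
        return t;
    }
private:
    Backend b_;
};

// Front-end 2: expression templates. operator+ only records its operands.
struct expr_tag {};  // marks types that may appear in a lazy expression

template <class L, class R>
struct add_expr : expr_tag {
    const L& lhs;
    const R& rhs;
    add_expr(const L& l, const R& r) : lhs(l), rhs(r) {}
    // out = lhs + rhs, computed in place
    template <class Backend> void eval_into(Backend& out) const { lhs.eval_into(out); rhs.add_into(out); }
    // out += lhs + rhs, computed in place
    template <class Backend> void add_into(Backend& out) const { lhs.add_into(out); rhs.add_into(out); }
};

template <class Backend>
class lazy_number : public expr_tag {
public:
    explicit lazy_number(long long v) : b_(v) {}
    // Evaluate the whole expression straight into this object's back-end.
    template <class L, class R>
    lazy_number(const add_expr<L, R>& e) { e.eval_into(b_); }
    const Backend& backend() const { return b_; }
    void eval_into(Backend& out) const { out = b_; }
    void add_into(Backend& out) const { out.add_assign(b_); }
private:
    Backend b_;
};

template <class L, class R,
          class = std::enable_if_t<std::is_base_of_v<expr_tag, L> && std::is_base_of_v<expr_tag, R>>>
add_expr<L, R> operator+(const L& l, const R& r) { return add_expr<L, R>(l, r); }

// Returns how many back-end copy constructions a three-term sum triggers.
template <class Number>
int copies_for_sum() {
    Number a(1), b(2), c(3);
    counting_backend::copies = 0;
    Number r = a + b + c;
    assert(r.backend().value == 6);
    return counting_backend::copies;
}
```

Calling `copies_for_sum<eager_number<counting_backend>>()` and `copies_for_sum<lazy_number<counting_backend>>()` runs the same back-end under both front-ends. The sketch ignores aliasing (for example `x = a + x`) and keeps references to operands, so an expression must be consumed within the statement that builds it. A production front-end would have to handle both.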
The key comparisons we should use to drive this implementation are:
- 32-bit posits
- 256-bit posits: posits are designed to provide oracle services to other number systems without being arbitrary precision. 256-bit posits also have hardware support and run at tera-op order (>10^12 operations per second)
- 64-bit unum type 1: a dynamic precision format with representations of at most 64 bits
- 512-bit fixed-points: the 512-bit fixed-point format is the base type of the quire for 32-bit posits
- arbitrary precision floats
- arbitrary precision integers
- arbitrary precision posits: this is still an open question. Arbitrary precision posits would basically forgo the benefits of tapering, but they would again provide a nice oracle mechanism for validation studies.
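One way to run the comparisons listed above is a small harness templated on the number type, so the same loop can be timed for each candidate with and without an expression template front-end. The sketch below is only an assumption about how such a harness might look. `throughput` and `measure_add` are hypothetical names, and the Universal and arbitrary precision types would be substituted for the built-in stand-ins used when trying it out.

```cpp
#include <chrono>
#include <cstddef>

// Result of one measurement: throughput in MOPS plus the accumulated value,
// which is returned so the compiler cannot discard the loop entirely.
template <class Number>
struct throughput {
    double mops;
    Number checksum;
};

// Times n additions of the form acc = acc + step for the given number type.
// Number must be constructible from an int and support operator+.
template <class Number>
throughput<Number> measure_add(std::size_t n) {
    Number acc(0), step(1);
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) acc = acc + step;
    auto stop = std::chrono::steady_clock::now();
    double seconds = std::chrono::duration<double>(stop - start).count();
    // Guard against a zero-length interval on very short or optimized runs.
    return { seconds > 0 ? n / seconds / 1e6 : 0.0, acc };
}
```

For example, `measure_add<double>(1000000).mops` gives a baseline to compare against the same call instantiated with a posit or fixed-point type. A serious benchmark would also need warm-up runs, repeated measurements, and a mix of operations beyond addition.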