Add generators of headers for inlining whole sleef functions
There have been several requests to provide vectorized SLEEF functions in a header file, for example:
https://github.com/shibatch/sleef/issues/230
It turns out that this functionality is not too hard to implement. We can use the C preprocessor (cpp) to generate such header files. We will apply cpp a few times to sleefsimddp.c and sleefsimdsp.c with the SLEEF_GENHEADER macro and other appropriate macros defined. Parts of the code in misc.h and the helper files will be guarded with #if, so that irrelevant lines do not appear in the resulting header files.
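To illustrate the #if guarding idea, here is a minimal sketch. It is not taken from the SLEEF sources: the macro DEMO_GENHEADER and the names demo_square and demo_print_square are hypothetical stand-ins for SLEEF_GENHEADER and the real functions.

```c
/* demo_guard.c: hypothetical illustration, not actual SLEEF code. */
#ifdef DEMO_GENHEADER
/* Header-generation pass: emit an inlinable definition only. */
#define DEMO_EXPORT static inline
#else
/* Normal library build: pull in build-only includes and export the symbol. */
#include <stdio.h>
#define DEMO_EXPORT
#endif

/* This definition is kept in both modes. */
DEMO_EXPORT double demo_square(double x) { return x * x; }

#ifndef DEMO_GENHEADER
/* Build-only helper that should not appear in the generated header. */
void demo_print_square(double x) { printf("%g\n", demo_square(x)); }
#endif
```

Running the preprocessor over such a file with the macro defined, for example `cc -E -P -DDEMO_GENHEADER demo_guard.c`, should leave only the static inline definition of demo_square in the output, since the guarded include and helper are dropped.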
A header file is generated for each vector extension. The resulting header files contain the inline functions defined in dd.h and df.h, and the functions defined in sleefsimddp.c and sleefsimdsp.c, with all the function names renamed to the ones specified in the standard SLEEF API.
Since the inline functions in the helper files are not renamed, header files for different vector extensions cannot be included from the same file.
An example of a generated header file can be seen at the following URL.
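The renaming step can be pictured with a small token-pasting sketch. This is only an assumption about how such renaming could be expressed with the preprocessor. Every identifier below (DEMO_CAT, DEMO_ISA, DEMO_RENAME, xdouble_it, Demo_doubleitd1_) is hypothetical, and a scalar double stands in for a vector type so that the sketch stays self-contained.

```c
/* Hypothetical renaming sketch, not actual SLEEF code. */

/* Two-level concatenation so that DEMO_ISA is expanded before pasting. */
#define DEMO_CAT_(a, b) a##b
#define DEMO_CAT(a, b) DEMO_CAT_(a, b)

/* Suffix chosen per vector extension (assumed name). */
#define DEMO_ISA demoavx2
#define DEMO_RENAME(base) DEMO_CAT(base, DEMO_ISA)

/* Map the short internal name to the public, extension-specific name. */
#define xdouble_it DEMO_RENAME(Demo_doubleitd1_)

/* Written once under the internal name; the preprocessor emits it as
   Demo_doubleitd1_demoavx2. */
static inline double xdouble_it(double x) { return x + x; }
```

With a different DEMO_ISA value, the same source line would produce a differently named function, which is the property needed for one header per vector extension.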
http://www.aist-nara.ac.jp/~n-sibata/header_example/sleefinline_avx2.h
Please take a look at the example of a generated header file and let me know your thoughts.
Below is another example.
http://www.aist-nara.ac.jp/~n-sibata/header_example/sleefinline_sse2.h
http://www.aist-nara.ac.jp/~n-sibata/header_example/helloinline.c
http://www.aist-nara.ac.jp/~n-sibata/header_example/helloinline.s
Note that the function is inlined in helloinline.s.
This sounds good to me in principle. One thing you should make sure of is that duplication is minimal, if any is needed at all. This is to avoid having to update code in both the library and the header files when an algorithm is changed. Does that make sense?
There will be almost no code duplication. The header files are generated from the existing sleefsimddp.c and sleefsimdsp.c. Some of the macros have to be written in a special format so that they can be processed at a later stage.
Sounds good to me!
I'm closing this issue as I believe it was solved by #283. Feel free to re-open it if that is not the case, or open a new issue.