font-rs
Runtime detection of SSE capabilities
Stable Rust now has the is_x86_feature_detected! macro, which should be used to switch between the SSE and fallback implementations based on runtime detection of the SSE capability.
Did you mean something like this?
pub fn accumulate(src: &[f32]) -> Vec<u8> {
    if is_x86_feature_detected!("sse") {
        unsafe { accumulate_sse(src) }
    } else {
        let mut acc = 0.0;
        src.iter()
            .map(|c| {
                // This would translate really well to SIMD
                acc += c;
                let y = acc.abs();
                let y = if y < 1.0 { y } else { 1.0 };
                (255.0 * y) as u8
            })
            .collect()
    }
}
Yes, very much like that. The comment can probably be adapted though :)
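One thing to watch with the snippet above: is_x86_feature_detected! only exists on x86/x86_64 targets, so the dispatch probably wants a cfg guard to keep other architectures building. A minimal sketch of that shape follows. The accumulate_scalar name is made up for illustration, and the body of accumulate_sse here is only a placeholder that forwards to the scalar loop, not the crate's actual SIMD code.

```rust
/// Portable fallback: running sum, absolute value, clamp to 1.0, scale to u8.
/// (Hypothetical name; the loop body matches the snippet above.)
fn accumulate_scalar(src: &[f32]) -> Vec<u8> {
    let mut acc = 0.0f32;
    src.iter()
        .map(|&c| {
            acc += c;
            let y = acc.abs();
            let y = if y < 1.0 { y } else { 1.0 };
            (255.0 * y) as u8
        })
        .collect()
}

/// Placeholder for the real SIMD routine. The body is an assumption:
/// it just reuses the scalar loop so the sketch stays self-contained.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse")]
unsafe fn accumulate_sse(src: &[f32]) -> Vec<u8> {
    accumulate_scalar(src)
}

pub fn accumulate(src: &[f32]) -> Vec<u8> {
    // The detection macro is only defined on x86/x86_64, so gate it.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse") {
            // SAFETY: the runtime check above confirmed SSE is available.
            return unsafe { accumulate_sse(src) };
        }
    }
    // Non-x86 targets, or x86 CPUs without SSE, end up here.
    accumulate_scalar(src)
}
```

With this layout both paths are always compiled on x86, and the choice between them happens once per call.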
The comments can be left as bookmarks, to track the code copying patterns :D Can make a pull request tomorrow.
Also, the overhead of one branch instruction (for feature detection), as well as the zeroing of the result vector (added by me in accumulate_sse), keeps bugging me.
Maybe something can be done about the vector zeroing. I'm not sure building for one specific CPU feature is a good alternative to runtime feature detection.
Both are most likely insignificant and not worth the time unless profiling says otherwise.
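On the zeroing: one possible way to skip it is to allocate with Vec::with_capacity, write into the spare capacity, and only then set the length. A rough scalar sketch, assuming the output has exactly one byte per input element; accumulate_no_zeroing is a made-up name, and the SSE version would need the same idea adapted to its own stores. Whether it saves anything measurable would still have to come from profiling.

```rust
use std::mem::MaybeUninit;

/// Hypothetical variant that never zero-fills the output buffer.
pub fn accumulate_no_zeroing(src: &[f32]) -> Vec<u8> {
    let n = src.len();
    // Reserves room for n bytes without initializing them.
    let mut out: Vec<u8> = Vec::with_capacity(n);
    {
        // Uninitialized tail of the allocation; its length is at least n.
        let spare: &mut [MaybeUninit<u8>] = out.spare_capacity_mut();
        let mut acc = 0.0f32;
        // zip stops after n items, so exactly the first n slots are written.
        for (slot, &c) in spare.iter_mut().zip(src) {
            acc += c;
            let y = acc.abs();
            let y = if y < 1.0 { y } else { 1.0 };
            slot.write((255.0 * y) as u8);
        }
    }
    // SAFETY: capacity is at least n and the loop initialized the first n bytes.
    unsafe { out.set_len(n) };
    out
}
```

The cost is one more unsafe block to audit, so it is probably only worth it if the zeroing actually shows up in a profile.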