font-rs

Runtime detection of SSE capabilities

Open raphlinus opened this issue 6 years ago • 3 comments

Stable Rust now has the is_x86_feature_detected! macro, which should be used to switch between the SSE and fallback implementations based on runtime detection of SSE support.

raphlinus avatar Sep 18 '18 20:09 raphlinus

Did you mean something like this?

pub fn accumulate(src: &[f32]) -> Vec<u8> {
    if is_x86_feature_detected!("sse") {
        unsafe { accumulate_sse(src) }
    } else {
        let mut acc = 0.0;
        src.iter()
            .map(|c| {
                // This would translate really well to SIMD
                acc += c;
                let y = acc.abs();
                let y = if y < 1.0 { y } else { 1.0 };
                (255.0 * y) as u8
            }).collect()
    }
}
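For reference, here is a self-contained sketch of the same dispatch that also builds on non-x86 targets. It is only an illustration under assumptions: accumulate_scalar is a name made up here for the fallback branch, and the accumulate_sse body is a stand-in that reuses the scalar loop, not the crate's actual SSE kernel.

```rust
/// Scalar fallback (hypothetical name): running sum, clamp |acc| to 1.0,
/// then scale to the 0..=255 range.
fn accumulate_scalar(src: &[f32]) -> Vec<u8> {
    let mut acc = 0.0f32;
    src.iter()
        .map(|&c| {
            acc += c;
            let y = acc.abs().min(1.0);
            (255.0 * y) as u8
        })
        .collect()
}

/// Stand-in for the real SSE kernel. The body here just reuses the scalar
/// loop so the sketch stays self-contained; it is NOT the crate's kernel.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse")]
unsafe fn accumulate_sse(src: &[f32]) -> Vec<u8> {
    accumulate_scalar(src)
}

pub fn accumulate(src: &[f32]) -> Vec<u8> {
    // The detection macro only exists on x86/x86_64, so gate it at compile time.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse") {
            // Safety: SSE support was just confirmed at runtime.
            return unsafe { accumulate_sse(src) };
        }
    }
    accumulate_scalar(src)
}
```

With this shape, accumulate(&[0.5, 0.25, -1.0, 2.0]) yields [127, 191, 63, 255] on either path.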

6D65 avatar Sep 19 '18 05:09 6D65

Yes, very much like that. The comment can probably be adapted though :)

raphlinus avatar Sep 19 '18 06:09 raphlinus

The comments can be left as bookmarks, to track the code-copying patterns :D I can make a pull request tomorrow.

Also, two things keep bugging me: the overhead of one branch instruction (for feature detection) and the zeroing of the result vector (which I added in accumulate_sse).

Maybe something can be done about the vector zeroing. I'm not sure that building for one specific CPU feature is a good alternative to runtime feature detection.
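To make the compile-time versus runtime trade-off concrete, here is a small sketch. Both function names are invented for illustration. The first check is fixed when the binary is built (for example with -C target-feature=+sse, or because the target's baseline already includes SSE, as on the usual x86_64 targets). The second asks the CPU the binary is actually running on.

```rust
/// True only if the compiler was told SSE is available for this build
/// (hypothetical helper name).
pub fn sse_enabled_at_compile_time() -> bool {
    cfg!(target_feature = "sse")
}

/// True if the CPU running this binary reports SSE support
/// (hypothetical helper name).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn sse_available_at_runtime() -> bool {
    is_x86_feature_detected!("sse")
}

/// On non-x86 targets there is nothing to detect.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn sse_available_at_runtime() -> bool {
    false
}
```

A build that enables the feature at compile time drops the branch but will not run on a CPU without it. The runtime check keeps a single binary working everywhere at the cost of that one test.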

Both the branch and the zeroing are most likely insignificant and not worth the time unless profiling says otherwise.
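If profiling ever does flag the zeroing, one safe option might look like the sketch below. Both function names are made up for the comparison, and the loops are the scalar version rather than the SSE kernel. The idea is to reserve capacity and append, instead of zero-filling and then overwriting. Note that vec![0u8; n] can be served by an already-zeroed allocation, so the difference may well be negligible.

```rust
/// Zero-fill first, then overwrite every element (hypothetical name).
fn accumulate_zeroed(src: &[f32]) -> Vec<u8> {
    let mut dst = vec![0u8; src.len()];
    let mut acc = 0.0f32;
    for (d, &c) in dst.iter_mut().zip(src) {
        acc += c;
        *d = (255.0 * acc.abs().min(1.0)) as u8;
    }
    dst
}

/// Reserve capacity only, then append: no up-front zeroing and no unsafe
/// (hypothetical name).
fn accumulate_reserved(src: &[f32]) -> Vec<u8> {
    let mut dst = Vec::with_capacity(src.len());
    let mut acc = 0.0f32;
    dst.extend(src.iter().map(|&c| {
        acc += c;
        (255.0 * acc.abs().min(1.0)) as u8
    }));
    dst
}
```

A kernel that writes through raw pointers cannot use extend directly. There, writing into the vector's spare capacity and then calling set_len would be the unsafe equivalent.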

codri avatar Sep 19 '18 17:09 codri