
Add 4ic deinterleave to 8i x2

Open dkozel opened this issue 4 years ago • 2 comments

I think this can be done pretty efficiently if bytewise arithmetic shift operators exist. Otherwise it's probably just loop unrolling? The volk_8ic_deinterleave_16i_x2 kernels are much more complicated than I expected, though, so I'm probably not aware of many of the nuances of the available SIMD operations.

uint8_t input[size];
uint8_t out_1[size];
uint8_t out_2[size];

for (int i = 0; i < size; i++) {
    /* Low nibble: shift the high nibble out, then shift back down. */
    out_1[i] = input[i] << 4;
    out_1[i] = out_1[i] >> 4;
    /* High nibble. */
    out_2[i] = input[i] >> 4;
}

dkozel avatar Jul 31 '20 22:07 dkozel

So let's see, you propose a new kernel volk_4ic_deinterleave_8i_x2? Do you have a use case? I have an idea how to use such low resolution values. But I'd suggest a LUT instead of shifts. Are you willing to implement a first kernel?
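For illustration, the LUT idea could look roughly like the sketch below. It is not taken from VOLK: the names init_4ic_luts and deinterleave_4ic_lut are hypothetical, and it assumes each byte holds one complex sample with I in the low nibble and Q in the high nibble, both as signed two's-complement 4-bit values. The hardware's actual packing order may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tables: one entry per possible packed byte value. */
static int8_t lut_i[256];
static int8_t lut_q[256];

/* Fill both tables once. (x ^ 8) - 8 sign-extends a 4-bit
 * two's-complement value without relying on signed right shifts. */
static void init_4ic_luts(void)
{
    for (int b = 0; b < 256; b++) {
        lut_i[b] = (int8_t)(((b & 0x0F) ^ 0x08) - 0x08);        /* low nibble (assumed I) */
        lut_q[b] = (int8_t)((((b >> 4) & 0x0F) ^ 0x08) - 0x08); /* high nibble (assumed Q) */
    }
}

/* Deinterleave num_points packed bytes into two int8_t buffers. */
static void deinterleave_4ic_lut(int8_t* i_out,
                                 int8_t* q_out,
                                 const uint8_t* packed,
                                 size_t num_points)
{
    for (size_t n = 0; n < num_points; n++) {
        i_out[n] = lut_i[packed[n]];
        q_out[n] = lut_q[packed[n]];
    }
}
```

The tables cost 512 bytes and replace the per-sample mask, shift, and sign extension with two loads. Whether that beats plain shifts likely depends on the target and on how well the shift version vectorizes.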

jdemel avatar Aug 02 '20 09:08 jdemel

@jdemel Yes, I'm writing blocks for a radio astronomy acquisition board that streams packed signed 4-bit IQ data. And yes, I'll put up a PR shortly with nearly complete generic and SSE2 kernels, though I have some uncertainty about dispatchers and input data types, as there isn't a native 4-bit type in C++.
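Since the data is signed, the nibbles need sign extension, which the unsigned loop in the opening post does not do. A minimal sketch of what a generic kernel might look like follows. The function names are hypothetical, the packed input is passed as plain uint8_t bytes because there is no native 4-bit type, and the nibble order (I low, Q high) is an assumption carried over from the opening post rather than something the thread confirms.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend a 4-bit two's-complement value
 * (0..15) to int8_t (-8..7). */
static inline int8_t sign_extend_nibble(uint8_t nibble)
{
    return (int8_t)((int)((nibble & 0x0F) ^ 0x08) - 0x08);
}

/* Hypothetical generic kernel: one packed byte in, one I and one Q out. */
static void deinterleave_4ic_to_8i_x2_sketch(int8_t* i_out,
                                             int8_t* q_out,
                                             const uint8_t* packed,
                                             size_t num_points)
{
    for (size_t n = 0; n < num_points; n++) {
        i_out[n] = sign_extend_nibble(packed[n] & 0x0F); /* low nibble (assumed I) */
        q_out[n] = sign_extend_nibble(packed[n] >> 4);   /* high nibble (assumed Q) */
    }
}
```

The XOR-and-subtract form avoids right-shifting a negative signed value, whose result is implementation-defined in C.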

dkozel avatar Aug 04 '20 19:08 dkozel