Johannes Demel
As long as the output signature is `32f`, roots of negative values do not exist. [The C reference for sqrt](https://en.cppreference.com/w/c/numeric/math/sqrt) says: 1. `+-0` should return the input value 2. Values...
The `invsqrt` kernel needs to stay until we do another major release because we follow semver and can't just remove a kernel. We can remove a specific implementation of a...
I'd argue that a specialized test would be great in this case.
@Aang23 first off, a fast LDPC decoder implementation is pretty cool. We have some convolutional and polar decoder code in VOLK. LDPC might fit as well. Speaking of those other...
So which functions do you suggest porting to VOLK? The inner loop with SIMD instructions? The whole `decoder`? Oh wait: This function specifically: https://github.com/altillimity/SatDump/blob/df2fbdf67540a1feb5fa04f5907a846f7b9a456c/plugins/simd_extensions/simd_avx2/ldpc_decoder/ldpc_decoder_avx.cpp#L121 At the moment, we'd have...
My biggest worry with this approach would be that VOLK currently tries to not have state. At least none of the kernels do. This would drastically change with this kernel....
I'd be in favor of adding RISC-V support for VOLK. Also, we already run QA tests for RISC-V and we already have some hand-optimized assembly for this ISA. Though,...
Is `riscv_vector` a good name for this architecture? The RISC-V naming system is getting a bit confusing.
The [GCC RISC-V architecture options](https://gcc.gnu.org/onlinedocs/gcc/RISC-V-Options.html) imply that we could optimize for any given CPU. That'd be too many machines to compile, I assume. The GCC argument would potentially be: ```bash...
I suggest using intrinsics. So far, we mostly use intrinsics and only rarely use ASM. Also, we tend to replace the ASM code with intrinsics when they're available. If...
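For illustration, an intrinsics-based loop might look roughly like the sketch below. It assumes a toolchain that ships `<riscv_vector.h>` with the `__riscv_`-prefixed RVV intrinsics and defines `__riscv_vector` when the V extension is enabled; on any other target it falls back to a scalar loop. `add_32f_sketch` is a hypothetical name, not an existing VOLK kernel.

```c
#include <stddef.h>
#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

/* Adds two float buffers element-wise. Hypothetical example kernel. */
static void add_32f_sketch(float* out, const float* a, const float* b, size_t n)
{
#if defined(__riscv_vector)
    /* Strip-mined loop: vsetvl reports how many elements this pass handles,
     * so no separate tail loop is needed. */
    while (n > 0) {
        size_t vl = __riscv_vsetvl_e32m8(n);
        vfloat32m8_t va = __riscv_vle32_v_f32m8(a, vl);
        vfloat32m8_t vb = __riscv_vle32_v_f32m8(b, vl);
        __riscv_vse32_v_f32m8(out, __riscv_vfadd_vv_f32m8(va, vb, vl), vl);
        a += vl;
        b += vl;
        out += vl;
        n -= vl;
    }
#else
    /* Scalar fallback for targets without the vector extension. */
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

Compared to inline ASM, the compiler handles register allocation and scheduling here, and the same source builds for any vector length.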