nanort
[TODO] SIMD, BF16/FP16, INT8 optimization
Currently NanoRT does not utilize SIMD (SSE/AVX).
It also has no quantized BVH support.
It would be worth starting to consider SIMD optimization and BVH quantization.
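As a rough illustration (not existing NanoRT code; all names here are hypothetical), a SIMDized ray-AABB slab test using SSE intrinsics could look like this:

```cpp
// Sketch only: ray-AABB slab test with SSE intrinsics. The x/y/z components
// sit in the first three lanes of each __m128; lane 3 is padding and is
// ignored by the final reduction.
#include <xmmintrin.h>  // SSE

struct RaySSE {
  __m128 org;      // ray origin (x, y, z, pad)
  __m128 inv_dir;  // reciprocal ray direction (x, y, z, pad)
};

inline bool IntersectAABB(const RaySSE &ray, __m128 bmin, __m128 bmax,
                          float tmin, float tmax) {
  // Entry/exit distances for all three slabs, computed in parallel.
  __m128 t0 = _mm_mul_ps(_mm_sub_ps(bmin, ray.org), ray.inv_dir);
  __m128 t1 = _mm_mul_ps(_mm_sub_ps(bmax, ray.org), ray.inv_dir);
  __m128 tsmaller = _mm_min_ps(t0, t1);  // per-axis near distances
  __m128 tbigger  = _mm_max_ps(t0, t1);  // per-axis far distances

  // Horizontal reduction over the three axis lanes.
  float s[4], b[4];
  _mm_storeu_ps(s, tsmaller);
  _mm_storeu_ps(b, tbigger);
  for (int i = 0; i < 3; ++i) {
    if (s[i] > tmin) tmin = s[i];
    if (b[i] < tmax) tmax = b[i];
  }
  return tmin <= tmax;
}
```

A production version would more likely test one ray against four child boxes at once (a 4-wide BVH), which maps better onto 128-bit SIMD lanes than a single-box test.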
Fortunately, recent CPU architectures (Alder Lake, Zen 4) provide native BF16/FP16 and INT8 instructions, which should boost quantized BVH construction/traversal.
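To make the INT8 direction concrete, here is a minimal sketch of one common quantization scheme (an assumption for illustration, not an existing NanoRT structure): child bounds are stored as 8-bit fixed-point offsets inside the full-precision parent AABB.

```cpp
// Sketch only: INT8-quantized BVH node (hypothetical layout).
#include <cstdint>
#include <cmath>

struct QuantizedNode {
  float parent_bmin[3];   // full-precision parent bounds
  float parent_bmax[3];
  uint8_t child_bmin[3];  // child bounds as 0..255 offsets inside the parent
  uint8_t child_bmax[3];
};

inline uint8_t Clamp255(float v) {
  return (uint8_t)(v < 0.0f ? 0.0f : (v > 255.0f ? 255.0f : v));
}

inline void QuantizeChildBounds(QuantizedNode &node,
                                const float child_bmin[3],
                                const float child_bmax[3]) {
  for (int i = 0; i < 3; ++i) {
    float extent = node.parent_bmax[i] - node.parent_bmin[i];
    float scale = (extent > 0.0f) ? 255.0f / extent : 0.0f;
    // Round min down and max up so the quantized box always encloses the
    // true child box (conservative: can never cause a missed intersection).
    node.child_bmin[i] =
        Clamp255(std::floor((child_bmin[i] - node.parent_bmin[i]) * scale));
    node.child_bmax[i] =
        Clamp255(std::ceil((child_bmax[i] - node.parent_bmin[i]) * scale));
  }
}
```

At traversal time a bound decodes as `parent_bmin[i] + q * extent / 255`; the conservative rounding means quantization can only add false-positive box hits, never false negatives.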
We can utilize https://github.com/DLTcollab/sse2neon to write SIMD code once against SSE intrinsics and have it run on NEON (Arm) targets as well. (TODO: RISC-V SIMD)
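For example, the usual pattern is to write the code against SSE intrinsics and switch the include per target (sketch):

```cpp
// Sketch: single-source SSE code compiled for both x86 and Arm targets.
#if defined(__aarch64__) || defined(__arm__)
#include "sse2neon.h"   // translates _mm_* intrinsics to NEON
#else
#include <xmmintrin.h>  // native SSE on x86
#endif
```

With this, SSE code such as the slab test above builds unchanged on NEON targets.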