nebuly
Benchmarks
Thanks for this work. It would make sense to:
- explain whether a speed-accuracy trade-off exists and, if so, how large it is
- add benchmarks of your optimization for various applications and architectures.
Thanks ogencoglu. We are currently building use cases that both show how to use the library in a more granular way and provide benchmarks of SOTA model acceleration on various hardware devices (mainly CPUs and GPUs). We will try to provide some benchmarks starting from the next release!
As for the speed-accuracy trade-off, we actually want to avoid any drop in accuracy. For exactly this reason we are not using methods like pruning and quantization. Some of the supported AI compilers may apply accuracy-reducing strategies of their own. However, in our tests so far we have not detected any significant drop in accuracy.
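Until official benchmarks are available, a speed-accuracy check of this kind can be sketched roughly as follows. This is only an illustration using the Python standard library: `baseline_model` and `optimized_model` are hypothetical stand-in callables, not part of the library's API, and the helper names (`median_latency`, `max_abs_deviation`, `compare`) are assumptions made for this example. In practice you would replace the stand-ins with your original model and its optimized version.

```python
import statistics
import time


def median_latency(fn, inputs, warmup=3, repeats=20):
    """Median wall-clock time (seconds) of one full pass over `inputs`."""
    # Warm-up passes so one-time setup costs do not skew the timings.
    for _ in range(warmup):
        for x in inputs:
            fn(x)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for x in inputs:
            fn(x)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def max_abs_deviation(baseline, optimized, inputs):
    """Largest element-wise absolute difference between the two outputs."""
    return max(
        abs(b - o)
        for x in inputs
        for b, o in zip(baseline(x), optimized(x))
    )


def compare(baseline, optimized, inputs):
    """Report the speedup and the worst-case output deviation."""
    base_t = median_latency(baseline, inputs)
    opt_t = median_latency(optimized, inputs)
    return {
        "speedup": base_t / opt_t,
        "max_abs_deviation": max_abs_deviation(baseline, optimized, inputs),
    }


# Hypothetical stand-ins: swap in the real model and its optimized version.
def baseline_model(x):
    return [sum(v * v for v in x), sum(x)]


def optimized_model(x):
    # Simulates a tiny numerical difference introduced by a compiler.
    return [sum(v * v for v in x) + 1e-7, sum(x)]


inputs = [[0.1 * i, 0.2 * i, 0.3 * i] for i in range(100)]
report = compare(baseline_model, optimized_model, inputs)
print(report)
```

Reporting the median rather than the mean keeps a single slow run from dominating the result, and the maximum deviation gives a conservative view of how far the optimized outputs drift from the baseline. For a real model, a task-level metric (for example, accuracy on a validation set) would be a more meaningful check than raw output deviation.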