candle
Add AVG Pooling cpu implementation
Attempts to resolve https://github.com/huggingface/candle/issues/2294
@WenheLI I think this is correct.
@EricLBuehler - Thanks! One more question: in the CPU backend implementation, we should be able to speed this up with vectorization. Does candle's codebase already have some infrastructure that can help with that?
Hi @WenheLI, I think you could use something like Rayon: replace .iter() with .par_iter() on one of the for loops. Parallelizing a single loop is probably enough, since Rayon uses the number of CPU cores as the number of threads by default.
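A minimal sketch of that idea, not the code from this PR: it parallelizes only the outermost loop (one task per channel plane) of a 2D average pool over a flat `f32` buffer and leaves the inner loops serial. The names `avg_pool2d` and `avg_pool2d_plane`, the channel-major layout, and the no-padding assumption are all hypothetical. To stay dependency-free it uses `std::thread::scope` instead of Rayon; with Rayon, the same outer loop would be written with `par_chunks_mut` on the output buffer and no explicit thread spawning.

```rust
use std::thread;

/// Hypothetical helper: average-pools a single (h, w) plane, serially.
/// Assumes no padding and k <= h, k <= w.
fn avg_pool2d_plane(src: &[f32], dst: &mut [f32], h: usize, w: usize, k: usize, stride: usize) {
    let h_out = (h - k) / stride + 1;
    let w_out = (w - k) / stride + 1;
    let scale = 1.0 / (k * k) as f32;
    for oy in 0..h_out {
        for ox in 0..w_out {
            let mut sum = 0.0f32;
            for ky in 0..k {
                // Start of the k contiguous elements of this window row.
                let row = (oy * stride + ky) * w + ox * stride;
                sum += src[row..row + k].iter().sum::<f32>();
            }
            dst[oy * w_out + ox] = sum * scale;
        }
    }
}

/// Hypothetical entry point: `src` holds `channels` planes of h * w values each.
/// Only this outermost loop runs in parallel.
fn avg_pool2d(src: &[f32], channels: usize, h: usize, w: usize, k: usize, stride: usize) -> Vec<f32> {
    let h_out = (h - k) / stride + 1;
    let w_out = (w - k) / stride + 1;
    let mut dst = vec![0.0f32; channels * h_out * w_out];
    thread::scope(|s| {
        // Each task gets a disjoint output chunk, so no locking is needed.
        for (src_plane, dst_plane) in src.chunks(h * w).zip(dst.chunks_mut(h_out * w_out)) {
            s.spawn(move || avg_pool2d_plane(src_plane, dst_plane, h, w, k, stride));
        }
    });
    dst
}
```

Spawning one OS thread per plane is only reasonable for small channel counts; a pool such as Rayon's caps the thread count and balances the work. Note that this is thread-level parallelism, which is separate from SIMD vectorization of the inner loop.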
Thanks! Added vectorization. Could someone take a look and review this?