blueoil
Optimize Add implementation
According to the performance measurements of LmResnet for ImageNet on FPGA, there is some low-hanging fruit: the Add
operator currently takes over 20 ms.
@tkng Try #1027 !!