Chenyang Guo
Regarding https://github.com/NVIDIA/FasterTransformer/blob/main/docs/vit_guide.md#int8-vs-fp16-speedup-on-vit

model: vit_B_16
device: A100
batch size: 32

We use quant_mode=ft1, and the INT8 speed is almost the same as FP16. Is there any update on this case?