Wei (Neil) Su

Results: 5 issues by Wei (Neil) Su

Summary: Add auto-vectorization implementation for int8-CPU-TBE API
Differential Revision: D54286969

fb-exported
cla signed

Summary: Increase prefetching and reduce backend stalls, as suggested by NVIDIA
Differential Revision: D53552699

fb-exported
cla signed

Summary: Try auto-vectorizing the CPU TBE-NBit reference implementation code
Differential Revision: D50142928

fb-exported
cla signed

Summary: Fix unused variable in GitHub CI
Differential Revision: D56366784

fb-exported
cla signed

Summary: Add CPU sequential TBE for int4 weight type and int4 output type
Differential Revision: D60242110

fb-exported
cla signed