Arash Bakhtiari

4 comments by Arash Bakhtiari

To create an LCE Lite package for iOS, we first need to create an LCE C API library similar to the [TF Lite C API](https://github.com/tensorflow/tensorflow/blob/1210b521aa2226b01ee1bd9528a8f247b7283efb/tensorflow/lite/c/BUILD#L55). Then the LCE C API will...

This boils down to implementing a fast binary matrix-vector multiplication.
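For readers unfamiliar with the idea, here is a minimal sketch of a binary matrix-vector product, assuming entries are restricted to +1/-1 and bit-packed into integers (bit set for +1, clear for -1). It uses the common XOR-plus-popcount trick; the helper names `pack_signs` and `binary_matvec` are hypothetical and this is not LCE's actual kernel, which would be written against SIMD intrinsics rather than Python integers.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int (bit i is 1 for +1, 0 for -1)."""
    word = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("entries must be +1 or -1")
        if v == 1:
            word |= 1 << i
    return word


def binary_matvec(packed_rows, packed_x, n):
    """Multiply a +1/-1 matrix (one packed int per row) by a packed +1/-1 vector of length n."""
    result = []
    for row in packed_rows:
        # XOR sets a bit wherever the two signs differ. Each mismatch
        # contributes -1 to the dot product and each match contributes +1,
        # so dot = (n - mismatches) - mismatches = n - 2 * mismatches.
        mismatches = bin(row ^ packed_x).count("1")
        result.append(n - 2 * mismatches)
    return result


# Illustrative data only (not taken from the original comment).
matrix = [[1, -1, 1, 1], [-1, -1, -1, -1], [1, 1, 1, 1]]
x = [1, 1, -1, 1]
packed_rows = [pack_signs(r) for r in matrix]
y = binary_matvec(packed_rows, pack_signs(x), len(x))
```

The point of the packing is that one machine word holds many entries, so a single XOR and popcount replaces that many multiply-accumulate operations.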

@yingapple can you please confirm whether this is only related to the OPT model, or have you observed similar issues with other models as well? Could you also please provide us with...

@s-jse thanks for reporting this issue! Currently, the DeepSpeed-FastGen [fused bias and activation kernel](https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation.py) requires the number of channels to be divisible by 8, as it takes advantage of vectorized instructions...
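As a rough illustration of the constraint described in that comment, the sketch below checks a channel count against the divisible-by-8 requirement and computes the next larger multiple. Both helpers are hypothetical and are not part of the DeepSpeed API. Whether rounding a model's channel count up (for example by padding) is appropriate is an assumption the comment does not address.

```python
VECTOR_WIDTH = 8  # divisibility requirement quoted in the comment; assumed fixed here


def channels_supported(num_channels, width=VECTOR_WIDTH):
    """Return True if the channel count is a positive multiple of the vector width."""
    return num_channels > 0 and num_channels % width == 0


def next_supported_channels(num_channels, width=VECTOR_WIDTH):
    """Round the channel count up to the nearest multiple of the vector width."""
    # Ceiling division written with floor division on the negated value.
    return -(-num_channels // width) * width


# Illustrative values only.
ok = channels_supported(4096)
not_ok = channels_supported(4100)
rounded = next_supported_channels(4100)
```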