Atchuth Naveen Ch comments



What kind of layers are optimized by torchao on a RTX 4090?

Thank you @supriyar for pointing me to the gemlite kernels and thanks to @mobicham for the work on gemlite. I am able to optimize my model using both torchao int4...

What kind of layers are optimized by torchao on a RTX 4090?

My bad, I had only evaluated with a batch size of 1. With larger batch sizes, I do see the performance gain from using the gemlite kernels.
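
Since the result depends on batch size, here is a rough sketch of how a tok/s sweep over several batch sizes could be timed. It is not the script used in this thread: `measure_tok_per_s`, `sweep`, and `dummy_decode_step` are hypothetical names, and the dummy step is a CPU stand-in for one decoding step of a quantized model.

```python
import time
from typing import Callable, Dict, Iterable


def measure_tok_per_s(decode_step: Callable[[int], None],
                      batch_size: int,
                      new_tokens: int = 32,
                      warmup: int = 3) -> float:
    """Return generated tokens per second for one batch size."""
    # Warm-up iterations are excluded from timing (kernel compilation, caches).
    for _ in range(warmup):
        decode_step(batch_size)
    start = time.perf_counter()
    for _ in range(new_tokens):
        decode_step(batch_size)
    # With a real GPU model, synchronize the device before reading the clock
    # (for example torch.cuda.synchronize()), otherwise only launch time is measured.
    elapsed = time.perf_counter() - start
    # Each step produces one token per sequence in the batch.
    return batch_size * new_tokens / elapsed


def sweep(decode_step: Callable[[int], None],
          batch_sizes: Iterable[int] = (1, 4, 16)) -> Dict[int, float]:
    """Measure throughput for each batch size."""
    return {bs: measure_tok_per_s(decode_step, bs) for bs in batch_sizes}


def dummy_decode_step(batch_size: int) -> None:
    # Hypothetical stand-in workload; replace with a real forward pass.
    sum(i * i for i in range(2000))


if __name__ == "__main__":
    for bs, tps in sweep(dummy_decode_step).items():
        print(f"batch_size={bs}: {tps:.1f} tok/s")
```

Running the same sweep once per backend would show at which batch size one kernel overtakes the other.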

What kind of layers are optimized by torchao on a RTX 4090?

Hi, I ran the script at https://github.com/mobiusml/hqq/blob/master/examples/hqq_lib_demo.py to cross-check with a batch size of 1. The `torchao` backend appears to run faster, at 156 tok/s versus 116 tok/s for the gemlite backend. What...