Manu Maheshwari

3 comments by Manu Maheshwari

For llama2-7b with q4fp16_1 quantization and a context length of 128, these are the context-phase time differences. I installed it using pip about a week ago. MLC-LLM - 266.3...

The GEMM times themselves are very large for the context phase.