Changheon Lee
2 comments
I downloaded your vLLM w8a8 branch, but I ran into the error below. Should I add Int8LlamaForCausalLM to SmoothQuant? ValueError: Model architectures ['Int8LlamaForCausalLM'] are not supported for now. Supported...
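For context, this ValueError is raised when vLLM looks up the checkpoint's architecture string and finds no matching entry in its model registry. A minimal, self-contained sketch of that lookup mechanism (the registry dict and placeholder classes here are illustrative stand-ins, not vLLM's actual internals):

```python
# Illustrative sketch of an architecture registry like the one vLLM
# consults at load time; the classes below are hypothetical placeholders.

class LlamaForCausalLM:          # stand-in for the stock FP16 model class
    pass

class Int8LlamaForCausalLM:      # stand-in for the w8a8 (int8) variant
    pass

# The registry maps the architecture string from the HF config.json
# to the model class that implements it.
MODEL_REGISTRY = {
    "LlamaForCausalLM": LlamaForCausalLM,
}

def load_model(architecture: str):
    """Mimics the failing lookup: unknown architectures raise ValueError."""
    if architecture not in MODEL_REGISTRY:
        raise ValueError(
            f"Model architectures ['{architecture}'] are not supported for now."
        )
    return MODEL_REGISTRY[architecture]()

# The fix the comment is asking about: register the int8 architecture
# so the lookup succeeds instead of raising.
MODEL_REGISTRY["Int8LlamaForCausalLM"] = Int8LlamaForCausalLM
model = load_model("Int8LlamaForCausalLM")
```

In the real branch the equivalent step would be adding the new class to the model registry in vLLM's model-loading code.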
Thanks for your answer. Did you apply partial quantization, meaning the down_proj layer remains in FP16 because of its large activation range? As you know, there is a comment...
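The partial quantization being asked about, keeping down_proj in FP16 while quantizing the other projections to int8, can be sketched as a skip list applied during weight conversion. A minimal illustration under assumed names (the layer names and the plain symmetric per-tensor int8 scheme here are illustrative, not the branch's actual implementation):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization; returns (int8 weights, scale)."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

# Layers left in FP16 because their activation range is too large
# for int8 (the down_proj case discussed in the comment).
SKIP_LAYERS = {"down_proj"}

def partially_quantize(layers: dict):
    """layers: mapping of layer name -> float weight array.

    Quantizes everything to int8 except names matching SKIP_LAYERS,
    which are kept in FP16.
    """
    out = {}
    for name, w in layers.items():
        if any(skip in name for skip in SKIP_LAYERS):
            out[name] = ("fp16", w.astype(np.float16))
        else:
            out[name] = ("int8", quantize_int8(w))
    return out
```

Usage: feeding in `{"q_proj": ..., "down_proj": ...}` yields int8 weights plus a scale for q_proj, while down_proj passes through unchanged as FP16.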