Subham Kumar
What's the current best option for running this 4-bit fine-tuned model with vLLM inference? Should I convert it to 16-bit first and then run inference?