Subham Kumar

What's the current best option for running this 4-bit fine-tuned model with vLLM for inference? Should I convert it to 16-bit first and then run inference?