jojac47

Results 2 issues of jojac47

Any ideas as to why the first generation for a model instance is good, but if I try to run that same instance with a new prompt it either returns...

I'm using Docker with the 12.1 nvidia/cuda container as a base. This worked perfectly for vllm until the switch to using cupy. The cupy import breaks vllm whenever you...