Vlad J
This worked for me with three 3090s:

python -m vllm.entrypoints.openai.api_server \
  --model ./stelterlab_openhands-lm-32b-v0.1-AWQ \
  --tensor-parallel-size 1 \
  --pipeline-parallel-size 3 \
  --quantization awq_marlin \
  --dtype float16 \
  --max-model-len...
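A likely reason for using pipeline parallelism here instead of tensor parallelism: as far as I know, vLLM requires the model's attention-head count to be divisible by the tensor-parallel size, and many models have a head count that 3 does not divide. The sketch below is only an illustration of that reasoning, not part of vLLM. `pick_layout` is a hypothetical helper, and the head count of 40 is an assumed value, so check `num_attention_heads` in the model's `config.json` before relying on it.

```python
def pick_layout(num_gpus: int, num_attention_heads: int) -> tuple[int, int]:
    """Return (tensor_parallel_size, pipeline_parallel_size) that uses every GPU.

    Hypothetical helper: picks the largest tensor-parallel size that divides
    both the GPU count and the attention-head count, then assigns the
    remaining factor to pipeline parallelism.
    """
    if num_gpus < 1 or num_attention_heads < 1:
        raise ValueError("num_gpus and num_attention_heads must be positive")
    tp = max(
        size
        for size in range(1, num_gpus + 1)
        if num_gpus % size == 0 and num_attention_heads % size == 0
    )
    return tp, num_gpus // tp


if __name__ == "__main__":
    # 40 heads is an assumption for illustration, not read from the model.
    tp, pp = pick_layout(num_gpus=3, num_attention_heads=40)
    print(f"--tensor-parallel-size {tp} --pipeline-parallel-size {pp}")
```

With three GPUs and an assumed 40 heads, this yields a tensor-parallel size of 1 and a pipeline-parallel size of 3, which matches the flags in the command above.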