Vlad J

Results: 2 comments by Vlad J

This has worked for me; I've got three 3090s:

python -m vllm.entrypoints.openai.api_server \
  --model ./stelterlab_openhands-lm-32b-v0.1-AWQ \
  --tensor-parallel-size 1 \
  --pipeline-parallel-size 3 \
  --quantization awq_marlin \
  --dtype float16 \
  --max-model-len...
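For anyone wondering why the command above uses pipeline parallelism rather than tensor parallelism on three cards: vLLM requires the model's attention head count to be divisible by the tensor-parallel size, and three rarely divides it. Below is a minimal sketch of that check. The helper `pick_parallel_sizes` is hypothetical (it is not part of vLLM), and the head count of 40 is an assumption about this model that is not stated in the comment; read `num_attention_heads` from the model's `config.json` to confirm.

```python
def pick_parallel_sizes(num_gpus: int, num_attention_heads: int) -> tuple[int, int]:
    """Return (tensor_parallel_size, pipeline_parallel_size) for num_gpus.

    Hypothetical helper, not a vLLM API. It prefers pure tensor parallelism
    when the head count splits evenly across the GPUs, and otherwise falls
    back to pure pipeline parallelism (one stage per GPU).
    """
    if num_gpus < 1 or num_attention_heads < 1:
        raise ValueError("num_gpus and num_attention_heads must be positive")
    if num_attention_heads % num_gpus == 0:
        return num_gpus, 1
    return 1, num_gpus


# Assumption: 40 attention heads. Check config.json for the real value.
ASSUMED_NUM_ATTENTION_HEADS = 40

tp, pp = pick_parallel_sizes(3, ASSUMED_NUM_ATTENTION_HEADS)
print(f"--tensor-parallel-size {tp} --pipeline-parallel-size {pp}")
```

With the assumed head count, three GPUs fall through to the pipeline-parallel branch, which matches the flags in the comment; with two or four GPUs the same helper would pick tensor parallelism instead.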