2 comments by Ze Wang
docker run --gpus '"device=0,1,2,3"' \
  --shm-size 1g \
  -p 8081:80 \
  -v /home/unionlab001/Model/qwen-72b:/data \
  ghcr.io/predibase/lorax:latest \
  --model-id /data/Qwen1_5-72B-Chat \
  --trust-remote-code \
  --quantize bitsandbytes-nf4 \
  --max-batch-prefill-tokens 300 \
  --max-input-length 200 \
  --max-total-tokens...
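For anyone trying to reproduce this, here is a minimal sketch of a client that could be used to check whether the container answers once it is up. It assumes the server is reachable on host port 8081 (from the `-p 8081:80` mapping above) and that it exposes the `/generate` REST route with an `inputs` / `parameters.max_new_tokens` JSON body, as LoRAX documents. The host address, the prompt, and the helper names `build_generate_request` and `generate` are placeholders of mine, not part of the original command.

```python
import json
import urllib.request

# Assumption: the container started by the command above is reachable on
# host port 8081 (from "-p 8081:80"). Adjust the host if it runs elsewhere.
LORAX_URL = "http://127.0.0.1:8081/generate"


def build_generate_request(prompt, max_new_tokens=32, url=LORAX_URL):
    """Build (but do not send) a POST request for the /generate route."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, max_new_tokens=32, timeout=60):
    """Send the request and return the decoded JSON response."""
    req = build_generate_request(prompt, max_new_tokens)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Keep the prompt short: the command above caps input at 200 tokens.
    print(generate("Hello, who are you?"))
```

Running the script prints the raw JSON reply, which makes it easier to tell a server-side error from a client-side one. If the request fails to connect, the container most likely did not finish loading the model.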