A.J
### System Info

- **Target:** x86_64-unknown-linux-gnu
- **Cargo version:** 1.69.0
- **Docker label:** N/A
- **CUDA driver version:** 12000
- **CUDA bare metal version:** 11.8
- **Pytorch CUDA version:** 2.0.1+cu118
- **System's CUDA version:** 11.8
- **GPU supported...
### Feature request

Allow control over the benchmarking utility's output, including its format and location, so that the benchmark can be run from Python wrappers such as `subprocess` or `pexpect`. Writing the output to a file might...
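To illustrate the kind of wrapper this request has in mind, here is a minimal sketch that runs a command with `subprocess` and redirects its combined output to a log file. The helper name `run_and_capture`, the log file name, and the stand-in command are all hypothetical; the real benchmark command and its options are not shown here and would need to be substituted.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_capture(cmd, log_path, timeout=None):
    """Run cmd, write its stdout and stderr to log_path, return the exit code.

    Hypothetical helper for illustration; not part of any benchmarking tool.
    """
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("w", encoding="utf-8") as log_file:
        completed = subprocess.run(
            cmd,
            stdout=log_file,           # send standard output to the file
            stderr=subprocess.STDOUT,  # merge standard error into the same file
            timeout=timeout,
            check=False,               # report the exit code instead of raising
        )
    return completed.returncode


# Stand-in command so the sketch runs anywhere; replace it with the
# actual benchmark invocation (assumed, not shown in this issue).
demo_cmd = [sys.executable, "-c", "print('benchmark finished')"]
log_file_path = Path(tempfile.mkdtemp()) / "run.log"
exit_code = run_and_capture(demo_cmd, log_file_path)
```

Plain redirection like this only helps if the tool writes ordinary text to its output streams. A tool that expects an interactive terminal would presumably need a pseudo-terminal instead, which is the case `pexpect` is meant for.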
I am attempting to download model checkpoints from the Azure Blob Model Registry associated with this repository. However, the access-denied errors I receive suggest that there is no public access to...
### 📚 The doc issue

**Context:** During the July 9, 2024 vLLM open office hours (FP8), there were several questions about how to **optimize** model deployment inference configurations targeting the two...