dave

4 comments by dave

Actually, nvcc runs fine, as shown here:

root@f8c2e06fbf8b:/mnt/vllm# nvcc -v
nvcc fatal : No input files specified; use option --help for more information
root@f8c2e06fbf8b:/mnt/vllm# nvcc -V
nvcc: NVIDIA...

CUDA is present:

root@f8c2e06fbf8b:/mnt/vllm# echo $CUDA_HOME
/usr/local/cuda
root@f8c2e06fbf8b:/mnt/vllm# type nvcc
nvcc is /usr/local/cuda/bin/nvcc
root@f8c2e06fbf8b:/mnt/vllm# python3 -c "import torch; print(torch.cuda.is_available()); print(torch.__version__);"
True
2.1.0a0+32f93b1
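For anyone reproducing this check, here is a minimal sketch that runs the same verification in one script. It assumes the standard /usr/local/cuda layout from the transcript above; adjust CUDA_HOME for your install.

#!/usr/bin/env bash
# Sketch: verify the CUDA toolkit location and compiler, then confirm PyTorch can see the GPU.
set -e
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"   # assumed default install path if unset
echo "CUDA_HOME=${CUDA_HOME}"
"${CUDA_HOME}/bin/nvcc" -V                  # capital -V prints the nvcc release; lowercase -v errors without input files
python3 - <<'EOF'
import torch
print("torch", torch.__version__)
print("cuda available:", torch.cuda.is_available())
EOF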

Yeah, we need to add support for arbitrary endpoints... e.g. Qwen3-235B-A22B-Instruct-2507, Qwen3-Coder-30B-A3B-Instruct, Qwen3-VL-8B-Instruct, Qwen3-VL-4B-Instruct... Which models to serve via llama-server should be up to the users, with no fixed list. Thanks. A hedged example of what "arbitrary endpoint" means in practice follows below.
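As a sketch of the idea: llama-server exposes an OpenAI-compatible chat completions endpoint, so a client only needs a base URL and whatever model name the user's server is hosting. The host, port (llama-server defaults to 8080), and model name below are assumptions for illustration, not values from this thread.

# Query an arbitrary local llama-server endpoint through its OpenAI-compatible API
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen3-Coder-30B-A3B-Instruct",
        "messages": [{"role": "user", "content": "Hello"}]
      }'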

cmake -B build
cmake --build build --config Release
cd build
make install
ldconfig

After that, all the llama-xxx binaries should work.
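As a quick sanity check after installing (binary names assume a recent llama.cpp build, and the install prefix is assumed to be the default /usr/local):

which llama-server llama-cli    # should resolve under the install prefix, e.g. /usr/local/bin
ldconfig -p | grep libllama     # confirm the runtime linker can now find libllama.so
llama-cli --version             # print the installed build's version string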