daanno2
> I built torch and vllm from source, with the latest triton, the latest nccl 2.26.2 (verified by print(torch.cuda.nccl.version())), CUDA 12.8, and the latest driver 570.124.04. On two 5090Ds, vllm can serve QwQ-32B-AWQ...
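For reference, a sanity check along these lines can confirm the toolchain the parent describes (a minimal sketch; the printed values will of course depend on your build):

```python
import torch

# Report the versions the parent comment mentions: the torch build, its
# CUDA version, the NCCL version (expected (2, 26, 2) here), and the
# visible GPUs (two 5090Ds in the parent's setup).
print("torch:", torch.__version__)
print("cuda:", torch.version.cuda)
print("nccl:", torch.cuda.nccl.version())
for i in range(torch.cuda.device_count()):
    print(f"gpu {i}:", torch.cuda.get_device_name(i))
```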
I was initially able to bypass the error by hardcoding `self.manual_cast_dtype` in _ComfyUI\comfy\model_base.py_:

`self.manual_cast_dtype = model_config.scaled_fp8  # self.manual_cast_dtype = model_config.manual_cast_dtype`

which led to this error:

`File "C:\Tools\ComfyUI_windows_portable\ComfyUI\comfy\ldm\flux\model.py", line 198, in forward...`
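For context, the hack amounts to something like the following (a sketch only; the surrounding `BaseModel.__init__` context is assumed, not copied verbatim from ComfyUI's source):

```python
# comfy/model_base.py, inside BaseModel.__init__ -- workaround sketch;
# the surrounding context is assumed, not an exact copy of ComfyUI.

# Original assignment, commented out:
# self.manual_cast_dtype = model_config.manual_cast_dtype

# Hardcoded bypass: reuse the scaled_fp8 dtype for manual casting. This
# gets past the initial error, but then trips the forward() failure in
# comfy/ldm/flux/model.py noted above.
self.manual_cast_dtype = model_config.scaled_fp8
```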