daanno2
> I built torch and vllm from source, with the latest triton, the latest nccl 2.26.2 (verified by print(torch.cuda.nccl.version())), CUDA 12.8, and the latest driver 570.124.04. On two 5090Ds, vllm can serve QwQ-32B-AWQ...
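For reference, a sanity check along these lines can confirm the toolchain the parent describes (a minimal sketch; the printed values will of course depend on your build):

```python
import torch

# Report the versions the parent comment mentions: the torch build, its
# CUDA version, the NCCL version (expected (2, 26, 2) here), and the
# visible GPUs (two 5090Ds in the parent's setup).
print("torch:", torch.__version__)
print("cuda:", torch.version.cuda)
print("nccl:", torch.cuda.nccl.version())
for i in range(torch.cuda.device_count()):
    print(f"gpu {i}:", torch.cuda.get_device_name(i))
```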
I was initially able to bypass the error by hardcoding `self.manual_cast_dtype` in _ComfyUI\comfy\model_base.py_:

`self.manual_cast_dtype = model_config.scaled_fp8  # self.manual_cast_dtype = model_config.manual_cast_dtype`

which led to this error:

`File "C:\Tools\ComfyUI_windows_portable\ComfyUI\comfy\ldm\flux\model.py", line 198, in forward...`
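For context, the hack amounts to something like the following (a sketch only; the surrounding `BaseModel.__init__` context is assumed, not copied verbatim from ComfyUI's source):

```python
# comfy/model_base.py, inside BaseModel.__init__ -- workaround sketch;
# the surrounding context is assumed, not an exact copy of ComfyUI.

# Original assignment, commented out:
# self.manual_cast_dtype = model_config.manual_cast_dtype

# Hardcoded bypass: reuse the scaled_fp8 dtype for manual casting. This
# gets past the initial error, but then trips the forward() failure in
# comfy/ldm/flux/model.py noted above.
self.manual_cast_dtype = model_config.scaled_fp8
```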