rowan-fan

Results: 9 comments by rowan-fan

Big shot, please take this newbie along!

I have the same problem as you. It is a problem with the ComfyUI-RVC node; it is recommended to use Comfy-RVC instead. The following problem still exists:

```python3
unknown args: ['--listen']...
```
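For context, "unknown args" messages like this usually mean a custom node's own argument parser is being fed ComfyUI's command line (here ComfyUI's host flag `--listen`). A minimal sketch of the tolerant pattern, assuming the node uses `argparse`; the `--model-dir` option is hypothetical:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--model-dir", default="models")  # hypothetical node option

# parse_args() would reject flags it does not know, such as ComfyUI's --listen;
# parse_known_args() keeps the node's own options and sets aside the rest.
args, unknown = parser.parse_known_args()
print("node args:", args, "| ignored:", unknown)
```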

Same problem.

```bash
Traceback (most recent call last):
  File "/root/ComfyUI/nodes.py", line 1993, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/root/ComfyUI/custom_nodes/comfyui-workspace-manager/__init__.py", line...
```

Deploying DeepSeek R1 with vLLM requires at least 2 nodes. vLLM natively provides distributed-deployment capability, but judging from Xinference's current architecture, cross-node TP and PP for vLLM do not appear to have been considered. It will likely be hard for Xinference to support a full vLLM deployment of DeepSeek R1 in the short term.
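For reference, a minimal sketch of what cross-node serving looks like on the vLLM side, assuming a Ray cluster already spans the two nodes (started with `ray start --head` on one node and `ray start --address=<head-ip>:6379` on the other); the model id, GPU counts, and parallel sizes below are illustrative assumptions, not a tested configuration:

```python
# Run on the head node of a 2-node Ray cluster (8 GPUs per node assumed).
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-R1",      # illustrative model id
    tensor_parallel_size=8,               # TP within each node
    pipeline_parallel_size=2,             # PP across the two nodes
    distributed_executor_backend="ray",   # use Ray for multi-node execution
)

outputs = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```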

> Set the parameter: --max_num_batched_tokens, refer to https://docs.vllm.ai/en/stable/performance/optimization.html

I followed your suggestion and set the parameters `--max-num-seqs` and `--max-num-batched-tokens` to 1, but the CUDA out of memory error still occurs....

> Try disabling CUDA Graph by `--enforce-eager`.

I tried disabling CUDA Graph using `--enforce-eager`, but the same error persists.

```bash
python3 -m vllm.entrypoints.openai.api_server \
    --host 0.0.0.0 \
    --port 9997 \
    ...
```
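These server flags map one-to-one onto vLLM's offline Python API (dashes become underscores), which can make it quicker to iterate on memory settings. A minimal sketch of the knobs discussed in this thread, with a tiny placeholder model and illustrative values rather than a known fix:

```python
from vllm import LLM

# Each keyword mirrors the server flag of the same name.
llm = LLM(
    model="facebook/opt-125m",     # tiny placeholder; substitute the real model
    enforce_eager=True,            # --enforce-eager: skip CUDA Graph capture
    max_num_seqs=1,                # --max-num-seqs: one sequence per batch
    max_num_batched_tokens=4096,   # --max-num-batched-tokens (must be
                                   # >= max_model_len unless chunked prefill is on)
    max_model_len=4096,            # shorter context -> smaller KV cache
    gpu_memory_utilization=0.90,   # fraction of VRAM vLLM may claim
)
```

If the OOM survives all of these, the weights plus KV cache may simply exceed the available VRAM, in which case tensor parallelism or quantization is the remaining lever.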

Same issue with Qwen3-Coder-30B-A3B-Instruct. Is there any way to solve it?