igerry
It has been supported since 0.3.1; the latest 0.3.2 works without issue.

```
python ktransformers/server/main.py \
  --port 8080 \
  --architectures Qwen3MoeForCausalLM \
  --model_name Qwen3-235B-A22B-Instruct-2507 \
  --model_path "/mnt/shared/models/Qwen3-235B-A22B-Instruct-2507-GGUF" \
  --gguf_path "/mnt/shared/models/Qwen3-235B-A22B-Instruct-2507-GGUF/Q8_0" \
  ...
```
> > Please DO NOT ADD --**cache_lens**
>
> If I do not specify `cache_lens`, I am restricted to 16k length. How do I specify 256k context length?

Yes,...
Tried [UD-Q8 from unsloth/Qwen3-Coder-480B-A35B-Instruct-1M-GGUF](https://huggingface.co/unsloth/Qwen3-Coder-480B-A35B-Instruct-1M-GGUF/tree/main/UD-Q8_K_XL) with no luck.
That's not enough, and not recommended. 300GB is fine.
```
python ktransformers/server/main.py \
  --port 8080 \
  --architectures Qwen3MoeForCausalLM \
  --model_name Qwen3-235B-A22B-Instruct-2507 \
  --model_path "/mnt/shared/models/Qwen3-235B-A22B-Instruct-2507-GGUF" \
  --gguf_path "/mnt/shared/models/Qwen3-235B-A22B-Instruct-2507-GGUF/Q8_0" \
  --optimize_config_path ktransformers/optimize/optimize_rules/Qwen3Moe-serve.yaml \
  --cpu_infer 32 \
  --temperature 0.7 \
  --top_p 0.8 \
  ...
```
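Once the server is up, you can sanity-check it from any OpenAI-compatible client. A minimal sketch, assuming the ktransformers server exposes the standard `/v1/chat/completions` endpoint on the port above; the `model` field here is assumed to match the `--model_name` flag, so adjust it if your setup differs:

```python
import requests

# Hypothetical quick check against the server started above; assumes an
# OpenAI-compatible /v1/chat/completions endpoint on port 8080.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "Qwen3-235B-A22B-Instruct-2507",  # assumed to match --model_name
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "temperature": 0.7,
        "top_p": 0.8,
        "max_tokens": 64,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

If this returns a completion, the server and GGUF paths are wired up correctly and any remaining issues are down to the individual flags.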