delphiRo
## 🐛 Bug

Segfault after the last update. Before the update, the gemma3 model worked fine.

## To Reproduce

```shell
ROCR_VISIBLE_DEVICES=2 python -m mlc_llm serve HF://mlc-ai/gemma-3-27b-it-q4f16_1-MLC --port 8081 --overrides "tensor_parallel_shards=1;max_total_seq_length=2768;gpu_memory_utilization=0.92;" --mode server...
```
## 🚀 Feature

We need recent, up-to-date support for AMD Instinct cards to increase total serving capacity. Specifically, I need to use the latest ROCm release with gfx906.
## 🐛 Bug

## To Reproduce

I have a very strange situation with MLC LLM, tested on qwen3-14b_q4f16. When I increase the request's input length to 1500–2500 tokens, then...