JaonLiu
wish +10086
> Thanks for your request! We will try to support that soon. Any good news?
Same problem when using Qwen2! With Qwen1.5, however, it works.
Same error when using GRPO.
@csukuangfj Can you please share a detailed tutorial? The tutorial on https://k2-fsa.github.io/sherpa/onnx/sense-voice/export.html#the-code is not very detailed.
@sunzj Can the Speculative Decoding Mode be used on Android?
> Likely the self speculating models like eagle would help in this case @tqchen How can I use the Eagle inference acceleration on an Android phone with MLC-LLM? Thanks a lot!
> Likely the self speculating models like eagle would help in this case @tqchen When using Eagle, it seems a draft model needs to be trained.
> a few hundred MB must be the CPU memory usage. However, the model is stored in the GPU memory, so OS-level memory command is not enough @Hzfengsy Is there...