JaonLiu

Results 72 comments of JaonLiu

> Thanks for your request! We will try to support that soon.

Any good news?

Same problem when using Qwen2! But with Qwen1.5, it works!

@csukuangfj Can you please share a detailed tutorial? The tutorial on https://k2-fsa.github.io/sherpa/onnx/sense-voice/export.html#the-code is not very detailed.

@sunzj Can the Speculative Decoding Mode be used on Android?

> Likely the self speculating models like eagle would help in this case

@tqchen How can I use the Eagle inference acceleration on an Android phone with MLC-LLM? Thanks a lot!

> Likely the self speculating models like eagle would help in this case

@tqchen When using Eagle, it seems a draft model needs to be trained.

> a few hundred MB must be the CPU memory usage. However, the model is stored in the GPU memory, so OS-level memory command is not enough

@Hzfengsy Is there...