Tianqi Chen
this likely is an RCCL issue and not something the MLC project can do much about, so closing it for now. Would love to know if there are follow-up findings, and we love...
@sheepHavingPurpleLeaf do you mind trying to create a Python script that reproduces the error? Likely you can do that through

```python
from openai import OpenAI
from mlc_llm.serve import PopenServer

def...
```
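Something along these lines might work; this is only a sketch, assuming `PopenServer` accepts a model path plus host/port keyword arguments and exposes `start()`/`terminate()` (check the signature in your installed version), and the model path below is just a placeholder:

```python
# Minimal reproduction sketch: launch the server in a subprocess, then query it
# through the OpenAI client. Model path, host, and port are placeholders.
from openai import OpenAI

from mlc_llm.serve import PopenServer

MODEL = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"  # placeholder model


def main():
    server = PopenServer(model=MODEL, host="127.0.0.1", port=8000)
    server.start()
    try:
        client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="none")
        response = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "Hello, who are you?"}],
            max_tokens=64,
        )
        print(response.choices[0].message.content)
    finally:
        server.terminate()


if __name__ == "__main__":
    main()
```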
we are moving toward a fully OAI-compatible API, which hopefully allows some customization of system prompts. You can use the LM chat template, which is mostly raw
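For instance, assuming a server is already running locally (e.g. via `mlc_llm serve`) and the LM template has been selected through the `conv_template` field of the model's `mlc-chat-config.json`, a custom system prompt can be passed straight through the standard chat completions call; the endpoint URL and model name below are placeholders:

```python
# Sketch: pass a custom system prompt through the OpenAI-compatible endpoint.
# Assumes a local mlc_llm server; URL and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="none")

response = client.chat.completions.create(
    model="placeholder-model",
    messages=[
        {"role": "system", "content": "You are a terse assistant that answers in one sentence."},
        {"role": "user", "content": "What does MLC LLM do?"},
    ],
    max_tokens=64,
)
print(response.choices[0].message.content)
```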
we have seen several examples working on older cards; likely you just need to turn off FlashInfer and CUTLASS, and also follow the instructions to build TVM from source
This should be fixed
If you see `mlc_llm: command not found`, try running `python -m mlc_llm`; usually it is due to multiple Pythons in the environment
Thanks for pointing this out. I think we can certainly enhance this behavior
@bayley do you know how these multiple system prompts get interpreted into the prompt specifically? Most chat templates follow a system message then user/assistant alternation
Right now we will implement the support by concatenating all system messages
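Roughly, the behavior would be to collapse all system entries into one system message before the chat template is applied; `merge_system_messages` below is only a hypothetical illustration, not the actual implementation:

```python
# Hypothetical sketch of concatenating multiple system messages into one
# leading system message before the chat template is applied.
from typing import Dict, List


def merge_system_messages(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Collapse all system messages into a single leading system message."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    if not system_parts:
        return others
    merged = {"role": "system", "content": "\n".join(system_parts)}
    return [merged] + others


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "system", "content": "Always answer in French."},
    {"role": "user", "content": "Hello!"},
]
print(merge_system_messages(messages))
# [{'role': 'system', 'content': 'You are a helpful assistant.\nAlways answer in French.'},
#  {'role': 'user', 'content': 'Hello!'}]
```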
make sure you have installed mlc_llm. On Windows we recommend running through a conda env