mlc-llm
How to run this using Python?
I built the model OK, but I don't know how to run it using Python.
`python tests/chat.py`? How do I configure it? It fails when I run it.
Hi @sleepwalker2017, thanks for trying out the project. Are you trying to just run the chat bot, or build from source? If you are just trying to run it, please follow the instructions here. If you are trying to build from source, please follow the instructions here.
It seems I got it running by modifying some Python code.
Here is what I did:
- python build.py --hf-path=databricks/dolly-v2-3b
- Add a line to tests/chat.py :
args.add_argument("--model", type=str, default="auto")
- run this cmd:
python tests/chat.py --artifact-path dist --model dolly-v2-3b --quantization q3f16_0 --max-gen-len 50
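For reference, here is a minimal sketch of what the argument parsing in `tests/chat.py` might look like with the extra `--model` line added. The other flags are assumptions inferred from the command above, not taken verbatim from the repo:

```python
import argparse

# Hypothetical sketch of tests/chat.py's argument parser with the extra
# --model flag added; the remaining flags are assumed from the run command.
args = argparse.ArgumentParser(description="MLC-LLM chat demo")
args.add_argument("--model", type=str, default="auto")
args.add_argument("--artifact-path", type=str, default="dist")
args.add_argument("--quantization", type=str, default="q3f16_0")
args.add_argument("--max-gen-len", type=int, default=50)

# Simulate the command-line invocation shown above.
parsed = args.parse_args(
    ["--model", "dolly-v2-3b", "--quantization", "q3f16_0", "--max-gen-len", "50"]
)
print(parsed.model)        # dolly-v2-3b
print(parsed.max_gen_len)  # 50
```

With `default="auto"`, omitting `--model` on the command line would no longer raise an error, which matches the one-line fix described above.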
Am I right?
I find the model's responses are nonsense.
I posted what I did above. Is that normal?
@sudeepag
Hi @sleepwalker2017, tests/chat.py is currently being used for debugging. We are planning on supporting a Python app soon – we will have a PR up within this week. In the meantime, please use the CLI or the iOS / Android instructions to run on the corresponding platforms.
OK, thank you! Could you please add detailed usage instructions for the Python API or CLI to the README file?
https://github.com/mlc-ai/mlc-llm/tree/main/python