mlc-llm
Add support for downloading weights from HF path
This PR allows us to pass an HF path (for example, lmsys/vicuna-7b-delta-v1.1) to build.py, so the user does not need to manually download weights.
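For context, here is a minimal sketch of what such an --hf-path download step could look like. The helper names and the dist/models layout are assumptions inferred from the log output in this thread, not the PR's actual code:

```python
import os
import subprocess


def local_model_dir(hf_path: str, models_dir: str = "dist/models") -> str:
    """Map an HF path like "lmsys/vicuna-7b-delta-v1.1" to a local directory."""
    return os.path.join(models_dir, hf_path.split("/")[-1])


def download_hf_weights(hf_path: str, models_dir: str = "dist/models") -> str:
    """Clone the Hugging Face repo (with git-lfs) if it is not already present."""
    target = local_model_dir(hf_path, models_dir)
    if not os.path.isdir(target):
        os.makedirs(models_dir, exist_ok=True)
        # Large weight files on the Hub are stored with git-lfs.
        subprocess.run(["git", "lfs", "install"], check=True)
        subprocess.run(
            ["git", "clone", f"https://huggingface.co/{hf_path}", target],
            check=True,
        )
    print(f"Downloaded weights to {target}")
    return target
```

This matches the "Updated git hooks. / Git LFS initialized. / Cloning into 'dist/models/…'" lines in the logs below.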
I encountered this problem. How can I resolve it?
I faced the same error. Please guide us on how to solve it.
(mlc-llm-env) ~/w/s/mlc-llm ❯❯❯ python build.py --hf-path=mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
Updated git hooks.
Git LFS initialized.
Cloning into 'dist/models/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0'...
remote: Enumerating objects: 69, done.
remote: Total 69 (delta 0), reused 0 (delta 0), pack-reused 69
Unpacking objects: 100% (69/69), 605.86 KiB | 953.00 KiB/s, done.
Filtering content: 100% (51/51), 1.46 GiB | 12.65 MiB/s, done.
Downloaded weights to dist/models/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
Traceback (most recent call last):
File "/Users/james/workspace/sources/mlc-llm/build.py", line 416, in <module>
ARGS = _parse_args()
^^^^^^^^^^^^^
File "/Users/james/workspace/sources/mlc-llm/build.py", line 76, in _parse_args
parsed = _setup_model_path(parsed)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/james/workspace/sources/mlc-llm/build.py", line 146, in _setup_model_path
validate_config(args.model_path)
File "/Users/james/workspace/sources/mlc-llm/build.py", line 179, in validate_config
assert os.path.exists(
AssertionError: Model path must contain valid config file.
@dfqddd @namchuai, the hf-path must refer to a model that has not yet been compiled (i.e., one in raw Hugging Face format), while all models under mlc-ai are already compiled (we upload the output directory of python build.py
to https://huggingface.co/mlc-ai).
For example, if you want to compile RedPajama, you should try:
python build.py --hf-path=togethercomputer/RedPajama-INCITE-Instruct-3B-v1 --quantization q4f16_0
rather than
python build.py --hf-path=mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
because the latter is already compilation output.
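The AssertionError in both tracebacks comes from a check along these lines: a raw Hugging Face checkpoint carries a config.json at the repo root, while an MLC-compiled output directory does not. This is a hedged reimplementation of that check, not the exact build.py source:

```python
import os


def validate_config(model_path: str) -> None:
    # Raw Hugging Face checkpoints ship a config.json at the repo root;
    # MLC-compiled output directories do not, which is what the
    # assertion in build.py trips over.
    assert os.path.exists(
        os.path.join(model_path, "config.json")
    ), "Model path must contain valid config file."
```

So a quick way to tell whether a repo is usable with --hf-path is to check for config.json in its file listing on the Hub.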
Thank you @yzh119 for clarifying things. But I still get the error below:
(mlc-llm-env) ~/w/s/mlc-llm ❯❯❯ python build.py --hf-path=mlc-ai/demo-vicuna-v1-7b-int4 --quantization q4f16_0 --target android --max-seq-len 768
Updated git hooks.
Git LFS initialized.
Cloning into 'dist/models/demo-vicuna-v1-7b-int4'...
remote: Enumerating objects: 276, done.
remote: Counting objects: 100% (276/276), done.
remote: Compressing objects: 100% (275/275), done.
remote: Total 276 (delta 1), reused 272 (delta 0), pack-reused 0
Receiving objects: 100% (276/276), 43.60 KiB | 7.27 MiB/s, done.
Resolving deltas: 100% (1/1), done.
Filtering content: 100% (133/133), 3.53 GiB | 11.98 MiB/s, done.
Downloaded weights to dist/models/demo-vicuna-v1-7b-int4
Traceback (most recent call last):
File "/Users/james/workspace/sources/mlc-llm/build.py", line 416, in <module>
ARGS = _parse_args()
^^^^^^^^^^^^^
File "/Users/james/workspace/sources/mlc-llm/build.py", line 76, in _parse_args
parsed = _setup_model_path(parsed)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/james/workspace/sources/mlc-llm/build.py", line 146, in _setup_model_path
validate_config(args.model_path)
File "/Users/james/workspace/sources/mlc-llm/build.py", line 179, in validate_config
assert os.path.exists(
AssertionError: Model path must contain valid config file.
Oh, my mistake. I was using the compiled one. Sorry.
As I already mentioned, all models under mlc-ai (including the mlc-ai/demo-vicuna-v1-7b-int4 you used) have already been compiled by MLC-LLM. You should instead pick a model in raw Hugging Face format (like the togethercomputer/RedPajama-INCITE-Instruct-3B-v1 I mentioned).