mlc-llm
Can you supply more converted models?
https://huggingface.co/wshhyh/mlc_llm-dolly-v2-int4 — I have tried to convert Dolly myself, but its environment is very hard to configure. Could you supply your converted models on Hugging Face for users to download?
I tried the same thing tonight. It started to run, but I get a NoneType error in the Triton compiler when it tries to `get_amdgpu_arch_fulldetails`. I assume this is due to some AMD GPU installation requirement I'm missing, but I couldn't get to the bottom of it and need to pack it in for the night. I feel like I'm frustratingly close to having it working.
EDIT: it's probably because I need rocm.cc, as alluded to here: https://github.com/openai/triton/blob/e2ae2c6c483f575f3c0531795d420f907e98b37a/python/triton/compiler/compiler.py#L201
Will try again tomorrow.
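In the meantime, a rough first diagnostic (my own sketch, not Triton's actual detection logic) is to check whether the standard ROCm command-line tools are visible at all, since `get_amdgpu_arch_fulldetails` returning None is consistent with the toolchain not being found:

```python
# Rough diagnostic sketch (not Triton's actual code): check whether the
# standard ROCm command-line tools are on PATH, which Triton's AMD GPU
# detection presumably relies on.
import shutil

for tool in ("rocminfo", "rocm-smi", "hipcc"):
    path = shutil.which(tool)
    print(f"{tool}: {path or 'not found -- is ROCm installed and on PATH?'}")
```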
To be clear, TVM Unity has both ROCm and Vulkan backends, which means we do not necessarily have to depend on ROCm the way Triton does. At the moment, I believe Vicuna-7B does work with AMD GPUs according to #15.
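As a quick sanity check, TVM exposes a device handle per runtime backend, so something like the sketch below (assuming a TVM Unity build with the relevant runtimes compiled in) will tell you which backends can actually reach a GPU on your machine:

```python
# Minimal sketch: probe which TVM runtime backends can reach a device.
# Assumes a TVM Unity build with the corresponding runtimes enabled.
import tvm

for name in ("vulkan", "rocm", "cuda", "metal"):
    dev = tvm.device(name, 0)
    print(f"{name}: {'available' if dev.exist else 'not available'}")
```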
We are going to release official Dolly support pretty soon. Documentation on customizing model coverage is also on the way. Please stay tuned :-)