
Universal LLM Deployment Engine with ML Compilation

Results: 578 mlc-llm issues

I absolutely love the idea of this repo and am very hopeful about its future. I loved it so much that I managed to get the download off TestFlight. However,...

type: troubleshooting

When trying out https://mlc.ai/web-llm/#chat-demo, it gave me the following error. My Chrome is 112.0.5615.137, and there is no update option in the settings. ``` Find an error initializing the...

Hi there! I'm new to programming. I really want to try to add AI to my own little program. For example, ChatGPT has an API I can talk to using Python...

feature request
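Since mlc-llm can serve models behind a chat server, a common first step from Python is a plain HTTP call. A minimal sketch, assuming an OpenAI-style `/v1/chat/completions` endpoint on localhost; the URL, port, model id, and response shape here are assumptions for illustration, not confirmed mlc-llm specifics:

```python
# Sketch: talking to a locally served LLM over an assumed
# OpenAI-style chat endpoint. Uses only the standard library.
import json
import urllib.request


def build_request(prompt, url="http://127.0.0.1:8000/v1/chat/completions"):
    """Prepare the HTTP request without sending it."""
    payload = {
        "model": "local-llm",  # placeholder model id (assumption)
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def chat(prompt):
    """Send the request; needs a server actually running locally."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

With a server running, `chat("Hello!")` would return the reply text; `build_request` on its own only prepares the request, so you can inspect the payload before sending anything.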

Hi everyone, we are looking to gather data points on running MLC-LLM on different hardware and platforms. Our goal is to create a comprehensive reference for new users. Please share...

help wanted

Does this not support AMD GPUs? I'm getting this error: ``` terminate called after throwing an instance of 'tvm::runtime::InternalError' what(): [18:53:29] /home/runner/work/utils/utils/tvm/src/runtime/vulkan/vulkan_instance.cc:111: --------------------------------------------------------------- An error occurred during the execution of...

question

Just wanted to report that this works perfectly on my GTX 1060 (6 GB) on my old i5-7200 with 16 GB RAM under Win10. So far, I never reached such a speed with all...

feature request

RWKV Raven 7B Gradio demo: https://huggingface.co/spaces/BlinkDL/Raven-RWKV-7B CPU INT4: https://github.com/saharNooby/rwkv.cpp 100% CUDA version: https://github.com/harrisonvanderbyl/rwkv-cpp-cuda ONNX converter: https://github.com/harrisonvanderbyl/rwkv-onnx GitHub project: https://github.com/BlinkDL/ChatRWKV Please let me know if you have any questions :)

type: feature request

I think the project has too little information on adjusting the config. For example, how do I load weights other than the provided demo ones? How do I adjust the temperature? I do not have...

documentation
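On adjusting the temperature: one low-tech approach is to edit the JSON chat config that ships alongside the converted weights. A minimal sketch, assuming a config file with `temperature` and `top_p` fields; the filename and key names are assumptions based on common conventions, not confirmed mlc-llm keys:

```python
# Sketch: rewriting sampling settings in an assumed chat config
# JSON file. Field names "temperature" and "top_p" are assumptions.
import json
from pathlib import Path


def set_sampling(config_path, temperature=0.7, top_p=0.95):
    """Load the config, update sampling fields, and write it back."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["temperature"] = temperature  # lower => more deterministic
    config["top_p"] = top_p              # nucleus-sampling cutoff
    path.write_text(json.dumps(config, indent=2))
    return config
```

For example, `set_sampling("mlc-chat-config.json", temperature=0.5)` would make sampling more conservative (the path here is hypothetical; use wherever your weights' config actually lives).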

Awesome project, thanks! Does it support sharding large models across multiple GPUs, or would this be in scope for this project in the future?

feature request

https://huggingface.co/h2oai These models are better than Dolly at 12B, for example, and are trained on OASST data.

feature request