Siyuan Feng
Actually, it performs well on Mac; please see the performance reports at https://github.com/mlc-ai/mlc-llm/issues/15. I guess you are using a Mac with an Intel integrated GPU. If so, there is nothing we can do as...
We use GPUs (including integrated GPUs) to run models. As the `Intel UHD Graphics 630` is too weak, there is nothing we can do to work around that hardware limitation.
All code, including the model conversion code, is in this repo. We are going to write tutorials soon.
As Relax has native dynamic-shape support and a modularized compilation flow, we won't spend time on Relay in this project. The project is built on Relax because: 1. We...
Could you try uninstalling and reinstalling `mlc-chat-nightly`?
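A reinstall along these lines usually does it (a minimal sketch; the `--pre` flag and the `https://mlc.ai/wheels` index are assumptions based on the usual nightly-wheel install instructions, so check the install docs for the exact command):

```shell
# Remove the currently installed package first, then fetch the latest
# nightly wheel. The wheel index URL and --pre flag are assumptions;
# see the project's installation docs for the authoritative command.
pip uninstall -y mlc-chat-nightly
pip install --pre mlc-chat-nightly -f https://mlc.ai/wheels
```

The clean uninstall matters: upgrading in place can leave stale compiled artifacts from an older nightly behind.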
Thanks for the bug report. It was just fixed in #316. You can give it a try after tomorrow's nightly build.
Please try RedPajama-INCITE-Chat-3B-v1, which is friendlier to 4 GB of VRAM.
Please see the [documentation](https://mlc.ai/mlc-llm/docs/tutorials/runtime/cpp.html#id2); you need to run `gen_cmake_config.py` to detect the device.
You are right. CPUs are too weak to run LLMs, so we focus only on the GPU environment due to our limited bandwidth. On the other hand, it's not hard...
It would be great if you could add test cases :)