Junru Shao

179 comments by Junru Shao

We've fixed several related issues in the past month. Could you double-check whether the issue persists? If so, please open a new issue with detailed information so...

We are not using `tune_relax` because it only supports static-shape workloads. We will release a tutorial soon.

I'm not sure, but Vicuna-7b, as a language model, is definitely prone to hallucination.

At the moment, this project focuses on a single consumer-class GPU, making it possible for everyone to run models on their own laptops and phones. We will bring in distributed inference later.

We haven't announced Dolly yet, but it should work out of the box as of today. As for the issue you reported, I ran into it once when I didn't compile Metal...

This should work properly on the latest HEAD. Please feel free to open a new issue if the problem persists :-)

It’s up! https://twitter.com/bohanhou1998/status/1655772690760994818

Hey, I made a Docker image (https://github.com/junrushao/llm-perf-bench) that may help benchmark MLC LLM performance. On the other hand, I don’t really think Docker is a perfect abstraction for those use cases...