Junru Shao
Update: I opened up a repo (https://github.com/junrushao/llm-perf-bench) of Dockerfiles to help reproduce CUDA performance numbers. The takeaway is: MLC LLM is around 30% faster than Exllama. I’m not a Docker...
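For context on how numbers like these are typically produced, here is a minimal sketch of a decode-throughput measurement (tokens per second); `generate` is a hypothetical stand-in for whichever backend (MLC LLM or Exllama) is being benchmarked, not an API from the repo:

```python
import time

def decode_tokens_per_second(generate, prompt, num_tokens=256):
    """Rough throughput: generated tokens divided by wall-clock seconds.

    `generate` is a hypothetical callable wrapping the backend under test;
    a careful benchmark would also warm up first and separate prefill
    from decode.
    """
    start = time.perf_counter()
    generate(prompt, max_new_tokens=num_tokens)
    return num_tokens / (time.perf_counter() - start)
```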
WSL support for Vulkan is not there yet AFAIK, so please use CMD instead if you want to use Vulkan on Windows.
> dial tcp: lookup

Would you mind checking the network connection?
It’s up! https://twitter.com/bohanhou1998/status/1655772690760994818
`tune_relax` isn't something we are currently using to tune LLMs, because it only supports static-shape workloads. Instead, we are using a mixed strategy that allows dynamic-shape workloads as...
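To illustrate the distinction (a sketch with hypothetical names, not the actual TVM API): a statically tuned kernel is specialized to one concrete shape, while a dynamic-shape kernel treats dimensions such as sequence length symbolically, which matters because the sequence grows at every decode step.

```python
import numpy as np

def matmul_static(a, b, tuned_seq_len=128):
    # Static-shape tuning bakes one concrete shape into the schedule,
    # so the kernel is only valid for the length it was tuned on.
    assert a.shape[0] == tuned_seq_len, "re-tune for every new sequence length"
    return a @ b

def matmul_dynamic(a, b):
    # A dynamic-shape schedule is parameterized by a symbolic length n,
    # so one tuned kernel covers every decode step as the context grows.
    return a @ b
```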
MacBooks in recent years all ship with M1/M2 GPUs, which are fairly capable of running LLM workloads.
Agreed that "tuning" is a pretty overloaded term - in this particular case, I am referring to an "auto-tuning compiler", which is the key to GPU performance. With TVM Unity auto...
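The core idea of an auto-tuning compiler, reduced to a toy loop (a hypothetical helper, not TVM Unity's actual interface): enumerate candidate schedules, measure each one on the real device, and keep the fastest.

```python
import time

def autotune(candidates, args, repeats=10):
    """Pick the fastest kernel variant by measuring it on the target.

    `candidates` is a hypothetical list of compiled variants (e.g., the
    same matmul with different tile sizes); real tuners search far larger
    spaces and use cost models instead of exhaustive measurement.
    """
    best, best_time = None, float("inf")
    for variant in candidates:
        start = time.perf_counter()
        for _ in range(repeats):
            variant(*args)
        elapsed = (time.perf_counter() - start) / repeats
        if elapsed < best_time:
            best, best_time = variant, elapsed
    return best, best_time
```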
MacBooks that ship with M1/M2 GPUs are quite capable of running those LLM workloads, but the Intel integrated GPUs in earlier MacBooks are indeed far behind.
I'm a native Chinese speaker, but have to admit that I don't fully understand non-ASCII encoding... @spectrometerHBH is an expert in this!

> To avoid garbled text in your CMD command...
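For anyone curious what the garbling looks like, here is a small illustration (cp936/GBK is chosen here only as an example of a legacy CMD code page; `chcp 65001` switches CMD to UTF-8):

```python
# UTF-8 bytes of a Chinese string, decoded under a legacy code page,
# come out as mojibake.
text = "你好"
garbled = text.encode("utf-8").decode("gbk")
print(garbled)  # something like 浣犲ソ
```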
RedPajama-3b should work