Hongyi Jin
Thanks for your suggestion. We will definitely support this feature after we enable more models.
I'm not sure about this. You can give it a try.
It's coming soon. Our team is internally testing running Vicuna within 4 GB of memory and will make it public once it's ready.
Try out our latest project https://github.com/mlc-ai/mlc-llm. You can run a model within a 4 GB memory constraint in the native runtime. We will support 4 GB LLMs on the web later.
Thank you for the advice. We are happy to see more and more models supported in web-llm. There are already open PRs about ChatGLM and Dolly model support. If you are...
Yes, of course embedding can be represented in TensorIR. So basically what you need to do is translate the model (the PyTorch implementation) into the corresponding Relax operators; see the sketch below. If there's no direct translation,...
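For illustration, here is a minimal sketch of an embedding lookup written in TensorIR via TVMScript. The buffer names and the vocabulary, hidden, and sequence sizes are assumptions chosen for the example, not taken from web-llm's actual code:

```python
from tvm.script import tir as T

@T.prim_func
def embedding(weight: T.Buffer((50257, 4096), "float32"),  # (vocab, hidden), assumed sizes
              ids: T.Buffer((128,), "int32"),              # token ids, assumed seq length
              out: T.Buffer((128, 4096), "float32")):
    # Gather one row of the weight table per token: out[i, :] = weight[ids[i], :]
    for i, j in T.grid(128, 4096):
        with T.block("embedding"):
            vi, vj = T.axis.remap("SS", [i, j])
            out[vi, vj] = weight[ids[vi], vj]
```

The data-dependent index `weight[ids[vi], vj]` is what makes this a gather; a Relax function can call into a TensorIR function like this when no built-in operator matches the PyTorch op.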
https://github.com/mlc-ai/web-llm/issues/19#issuecomment-1518940773
Thanks for the issue. There's some additional information I need: is only the JSON file requested from Hugging Face, or are the JSON file and the shards all requested?
I've checked that it's doable to skip this step. We will fix this in the coming weeks. If you find this issue urgent, I can show you the related code and you...
It seems interesting to get StableLM in, but it's not the top priority among the models we plan to support. Our next model to support is Dolly, and...