[Feature Request]: Allowing customization of model/wasm
Solution Description
We should allow customization of model/wasm, e.g., let advanced users provide their own set of app config entries that add on to our built-in ones. This way users can upload and run their own models.
Alternatives Considered
No response
Additional Context
No response
Hi, is this possible now? I have tried to add a custom model to the model_list of the appConfig, but it will always return an error if the given custom model is not already present in the prebuilt app config.
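For context, here is roughly what adding a custom entry to the `model_list` of the `appConfig` looks like, a minimal sketch assuming the web-llm API shape from recent releases (the field names `model`/`model_id`/`model_lib`, and exports like `prebuiltAppConfig`, `modelLibURLPrefix`, and `modelVersion`, may differ across versions; the HF repo and wasm file names below are placeholders):

```typescript
import * as webllm from "@mlc-ai/web-llm";

async function main() {
  // Extend the prebuilt config with a custom model record.
  // The HF repo URL and wasm file name are placeholders.
  const appConfig: webllm.AppConfig = {
    model_list: [
      ...webllm.prebuiltAppConfig.model_list,
      {
        model: "https://huggingface.co/my-org/MyModel-q4f16_1-MLC", // placeholder repo
        model_id: "MyModel-q4f16_1-MLC",
        model_lib:
          webllm.modelLibURLPrefix +
          webllm.modelVersion +
          "/MyModel-q4f16_1-ctx4k_cs1k-webgpu.wasm", // placeholder wasm
      },
    ],
  };

  // Load the custom model and run a quick sanity-check completion.
  const engine = await webllm.CreateMLCEngine("MyModel-q4f16_1-MLC", { appConfig });
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Hello!" }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```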
I also read about support for custom models through the MLC-LLM REST API. Maybe I am misunderstanding this feature, but wouldn't this mean the model is not being run locally within the browser but hosted on a server instead? Or will the webllm-chat client grab the model from the API endpoint and run it locally?
@PabloYG Thanks for following up. I hadn't put this as my priority after adding support for mlc-llm serve, but I see why that might not be sufficient now. I can prioritize this work next and make a release soon.
You are right that hosting custom models using mlc-llm serve starts a local server, and the app communicates with its API endpoints directly instead of running the model in-browser.
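For illustration, talking to a locally running mlc-llm serve instance from the client is just an HTTP call to its OpenAI-compatible endpoint, roughly like this minimal sketch (the host/port and model id are placeholders and depend on how the server was started):

```typescript
async function queryLocalServer() {
  // Placeholder address: adjust host/port to match your `mlc_llm serve` setup.
  const response = await fetch("http://127.0.0.1:8000/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "MyModel-q4f16_1-MLC", // placeholder: whichever model the server is hosting
      messages: [{ role: "user", content: "Hello!" }],
    }),
  });
  const data = await response.json();
  console.log(data.choices[0].message.content);
}

queryLocalServer();
```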
I'm thinking of the following two implementations:
- Allow users to upload models to HuggingFace and then add them to WebLLM-Chat via an HF URL
- Allow users to upload models directly from their local computer
I will start with the first one since it's easier, and defer the second. Please let me know if this meets your needs or if you have any other suggestions.
While the first one is definitely easier and helpful, the second one will be useful for situations where the whole thing needs to be disconnected/airgapped from any internet/SaaS repositories or APIs. This can be helpful for enterprises that deal with PII/health records or are otherwise sensitive/confidential. It can also be useful for devices that do not always have good connectivity, such as those in remote rural/wilderness areas or edge devices in the field.
Hey @Neet-Nestor, thanks for the quick reply. The first option sounds like lower-hanging fruit, yes, although I agree with @scorpfromhell that the second implementation could be incredibly useful. In fact, I suppose the ideal implementation would be an agnostic model loader: I could imagine some companies or researchers wanting to download models from their own hosting as well as upload them from a local machine. That might fall outside the scope of this app though, but I figured I'd suggest it.
About implementation 1, I suggest adding a link to the documentation directly in the UI and making it clear to the user that the model repo on HF must comply with the standards set in that doc. If the HF URL does not point to a compatible model repo, the MLCEngine will throw a somewhat cryptic Cache error when attempting to fetch. It took me a while to realize my HF repo was not properly set up.
Hope that's useful and thanks!
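Regarding the cryptic Cache error mentioned above, one option is to wrap the load call and rephrase the failure for the user. A minimal sketch, assuming the web-llm MLCEngine.reload API and a hypothetical model id; the exact error shape varies, so it only catches and re-throws with a friendlier message:

```typescript
import * as webllm from "@mlc-ai/web-llm";

async function loadCustomModel(engine: webllm.MLCEngine, modelId: string) {
  try {
    // Attempt to load a custom model previously registered in appConfig.
    await engine.reload(modelId);
  } catch (err) {
    // The underlying fetch/cache error from an incompatible repo can be cryptic,
    // so surface a clearer hint pointing at the custom-model documentation.
    throw new Error(
      `Failed to load "${modelId}". Check that the HuggingFace repo contains ` +
        `MLC-compiled weights and config (see the WebLLM custom model docs). ` +
        `Original error: ${err}`
    );
  }
}
```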
It's definitely helpful, thanks for the comments from both of you!
Loading local model files may require changes to the web-llm package itself, so I will first introduce custom models via HF URLs. But I will definitely keep local model files as a must-do on the roadmap.