
[model] StableLM 2 Zephyr 1.6b

Open flatsiedatsie opened this issue 1 year ago • 9 comments

I stumbled upon this two-week-old discussion about the StableLM 2 Zephyr 1.6b model becoming available for web-llm soon:

https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b/discussions/9

I'd really love to work with that model, as my testing so far has shown it to work surprisingly well for its size.

Is there any way to use this model already?

flatsiedatsie avatar Feb 19 '24 18:02 flatsiedatsie

I think I just found the files:

https://huggingface.co/OO8/1_6B_dev/tree/main

flatsiedatsie avatar Feb 19 '24 18:02 flatsiedatsie

Got stuck on:

[FATAL] /workspace/mlc-llm/3rdparty/tvm/include/tvm/runtime/packed_func.h:1908: Function tvmjs.array.decode_storage(0: runtime.NDArray, 1: basic_string<char>, 2: basic_string<char>, 3: basic_string<char>) -> void expects 4 arguments, but 3 were provided.
put_char @ web-llm.bundle.mjs:3421

flatsiedatsie avatar Feb 19 '24 21:02 flatsiedatsie

This is likely due to an old version of the web-llm npm package (if you are not building from source). If you are building from source, it is likely because the repo is not up to date; try pulling the recent changes.

CharlieFRuan avatar Feb 21 '24 01:02 CharlieFRuan

It would be fantastic if this model could become part of the default supported models.

The multi-language ability is fantastic. I'm very impressed with it, especially for its size.

flatsiedatsie avatar Mar 15 '24 09:03 flatsiedatsie

Awesome, it seems the model has already become available in the Huggingface repo. The chunks exist:

https://huggingface.co/mlc-ai

However, the .wasm files are missing from binary-mlc-llm-libs. I've created an issue about that.

https://github.com/mlc-ai/binary-mlc-llm-libs/issues/111

flatsiedatsie avatar Apr 05 '24 16:04 flatsiedatsie

Thanks for the request! We should be able to add the prebuilt wasm files in shortly. cc @YiyanZhai

CharlieFRuan avatar Apr 05 '24 16:04 CharlieFRuan

Fantastic! Thank you!

flatsiedatsie avatar Apr 06 '24 13:04 flatsiedatsie

For the record, I think there are more models for which the shards are available, but the wasm files are not (yet).

  • Music
  • ~WizardMath~
  • Gorilla
  • Gemma 7B
  • ~CodeLlama~
  • ~OpenHermes~

flatsiedatsie avatar Apr 06 '24 13:04 flatsiedatsie

Thanks for the list! WizardMath and OpenHermes can reuse the wasm of Mistral (as shown in prebuiltAppConfig in src/config.ts); CodeLlama should be able to reuse that of Llama-2, as long as they share the same quantization (e.g. q4f16_1) and number of params (e.g. 7B or 13B).
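For illustration, here is a minimal sketch of what reusing another model's wasm in a model record could look like. The field names follow web-llm's `ModelRecord` shape around v0.2.x, and the URLs are placeholders, not verified paths; check `prebuiltAppConfig` in `src/config.ts` for the exact shape.

```typescript
// Sketch: a model record that reuses an existing compiled wasm, valid only
// when both models share the same architecture, quantization (q4f16_1),
// and parameter count (7B). Field names and URLs are illustrative.
interface ModelRecord {
  model_url: string;     // where this model's weight shards live
  model_id: string;      // id used when loading the model
  model_lib_url: string; // compiled wasm; may point at another model's lib
}

const codeLlamaRecord: ModelRecord = {
  model_url: "https://huggingface.co/mlc-ai/CodeLlama-7b-q4f16_1/resolve/main/",
  model_id: "CodeLlama-7b-q4f16_1",
  // Reuse the Llama-2 7B q4f16_1 wasm instead of compiling a new one:
  model_lib_url:
    "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/main/Llama-2-7b-chat-hf-q4f16_1-webgpu.wasm",
};
```

The design point is that the wasm encodes the compiled compute graph (architecture + quantization + size), while the weight shards are model-specific, so fine-tunes of the same base can share one library.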

CharlieFRuan avatar Apr 10 '24 01:04 CharlieFRuan

Update: StableLM 2 Zephyr 1.6b is now part of the prebuilt app config, since 0.2.39.

For Music, this is not intended to be part of WebLLM's prebuilt list; the runtime does not yet support Gorilla's function calling; Gemma 7B is already tracked in https://github.com/mlc-ai/web-llm/issues/357.

Will close this issue for now. Feel free to open new ones!

CharlieFRuan avatar Jun 06 '24 20:06 CharlieFRuan