Tianqi Chen

637 comments by Tianqi Chen

This is now supported through the full OpenAI-compatible (OAI) API.

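To make that concrete, here is a minimal sketch of calling it from TypeScript, assuming the @mlc-ai/web-llm package and its CreateMLCEngine entry point (exact export names and the model id below vary by version and are illustrative):

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Downloads (or reuses cached) weights and compiles the model in-browser.
  const engine = await CreateMLCEngine("Llama-2-7b-chat-hf-q4f16_1"); // illustrative model id

  // The request/response shape mirrors the OpenAI chat completions API.
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Hello!" }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```
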
Thanks for asking! We have a standalone inference engine that combines WASM and JavaScript.

Each wasm accounts for one model type (i.e., all Llama2 variants that fit within a certain context length and vocab limit). We are working on revamping a framework that also makes...

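A hedged sketch of that idea: multiple weight sets can point at one shared wasm kernel library, as long as they agree on context length and vocab size. The field names and ids here are illustrative, not the exact WebLLM config schema:

```ts
// Hypothetical model list: two Llama2-7B weight sets share one compiled wasm.
const modelList = [
  {
    model_id: "Llama-2-7b-chat-hf-q4f16_1",
    model_lib: "llama-2-7b-q4f16_1-ctx4k.wasm", // shared kernel library
  },
  {
    model_id: "my-llama2-7b-finetune-q4f16_1",  // hypothetical fine-tune
    model_lib: "llama-2-7b-q4f16_1-ctx4k.wasm", // same wasm, different weights
  },
];
```
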
The model is cached in the browser cache. It runs fully on the frontend, without backend support.

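An illustrative sketch of the general mechanism (the cache name is hypothetical, not WebLLM's actual internals): fetch each weight shard once over the network, then serve later page loads from the browser Cache API.

```ts
// Return the shard bytes, hitting the network only on the first visit.
async function fetchWithCache(url: string): Promise<ArrayBuffer> {
  const cache = await caches.open("model-weights"); // hypothetical cache name
  let resp = await cache.match(url);
  if (!resp) {
    resp = await fetch(url);            // first load: download the shard
    await cache.put(url, resp.clone()); // persist it for future visits
  }
  return resp.arrayBuffer();
}
```
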
This is now fixed, and all conversation templates are standardized.

Thanks @beaufortfrancois for bringing it up! Unfortunately this is a limit we would need help from the Chrome side to lift, mainly because the model itself does require...

Thanks for the note! Let us look into it a bit and see if it is possible to get a variant of a small model that fits into this limit. In...

That model depends on the shader-f16 feature, which only exists in Chrome Canary (and not Chrome stable AFAIK), so I am not sure whether it works on Android. If it is possible to...

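A quick way to feature-detect this before picking an f16 model build; this is plain WebGPU API, with no WebLLM specifics assumed:

```ts
// True iff the adapter advertises the shader-f16 WebGPU feature.
async function hasShaderF16(): Promise<boolean> {
  const adapter = await navigator.gpu?.requestAdapter();
  return adapter?.features.has("shader-f16") ?? false;
}
```
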
It is possible that the model crashes on Llama: we can hit the VRAM limit with Llama2 models (which go beyond 4GB) even when using 4-bit quantization, and on iOS we had...

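A rough back-of-envelope for why 4GB is tight, using assumed round numbers rather than measured figures:

```ts
// Llama2-7B at 4-bit: ~0.5 byte per parameter for the weights alone,
// before the KV cache and activations are even counted.
const params = 7e9;
const bytesPerParam = 0.5;                             // 4-bit quantization
const weightsGiB = (params * bytesPerParam) / 2 ** 30;
console.log(`~${weightsGiB.toFixed(2)} GiB of weights`); // ~3.26 GiB
```

With the KV cache and runtime buffers on top of that, a 4GB budget is easy to exceed.
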
Glad that the 3B model works; this is the first running example of a WebGPU-native LLM on a mobile phone AFAIK. Thank you @beaufortfrancois for pushing this. Love to share this with...