Tianqi Chen

Results: 637 comments by Tianqi Chen

Let us also confirm whether this is the case for JSONFFIEngine.

Just to follow up on the case of JSONFFIEngine. The main purpose of JSONFFIEngine is to avoid passing in objects and parsing mlc-chat-config from the FFI side, so the current...
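To illustrate the idea described here, a minimal sketch of a JSON-over-FFI boundary: only JSON strings cross the boundary, and the engine side parses mlc-chat-config itself. The function name and payload shape below are hypothetical, not JSONFFIEngine's real API.

```python
# Hedged sketch: JSON string in, JSON string out, no shared objects across FFI.
# Names here are illustrative only.
import json


def json_ffi_chat_completion(request_json: str) -> str:
    """Boundary function: the caller never hands over parsed config objects."""
    request = json.loads(request_json)
    # Engine-side handling would go here; the engine loads and parses
    # mlc-chat-config internally instead of receiving it from the caller.
    reply = {
        "id": request.get("id"),
        "choices": [{"message": {"role": "assistant", "content": "..."}}],
    }
    return json.dumps(reply)
```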

@MasterJH5574 it would be good to confirm the current state of this issue in JSONFFI.

The latest MLCEngine should support concurrent generation and read the config once; see #2217.
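A minimal sketch of issuing concurrent requests, assuming the async engine (`AsyncMLCEngine`) and its OpenAI-style `chat.completions.create` API; the model path and prompts are placeholders, and #2217 remains the authoritative reference.

```python
# Hedged sketch: two generation requests issued concurrently against one engine.
import asyncio

from mlc_llm import AsyncMLCEngine

MODEL = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"  # assumed example model


async def ask(engine: AsyncMLCEngine, prompt: str) -> str:
    response = await engine.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model=MODEL,
    )
    return response.choices[0].message.content


async def main() -> None:
    engine = AsyncMLCEngine(MODEL)
    # Both requests are in flight at the same time on the same engine instance.
    answers = await asyncio.gather(
        ask(engine, "Summarize KV cache paging in one sentence."),
        ask(engine, "What does prefill chunk size control?"),
    )
    for answer in answers:
        print(answer)
    engine.terminate()


if __name__ == "__main__":
    asyncio.run(main())
```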

KV cache is a common interface; the solution right now would be to create a different instance of a KV cache implementation of the same interface and replace it.
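A minimal sketch of the pattern this describes: several implementations behind one shared interface, so swapping the cache means constructing a different instance and handing it to the engine. The class and method names are hypothetical, not MLC-LLM's actual KV cache classes.

```python
# Hedged sketch: swappable KV cache implementations behind a common interface.
from abc import ABC, abstractmethod


class KVCacheInterface(ABC):
    """The common interface every KV cache implementation must satisfy."""

    @abstractmethod
    def append(self, layer: int, key, value) -> None: ...

    @abstractmethod
    def fetch(self, layer: int) -> list: ...


class DenseKVCache(KVCacheInterface):
    """One implementation: a plain per-layer list of (key, value) pairs."""

    def __init__(self):
        self._store = {}

    def append(self, layer, key, value):
        self._store.setdefault(layer, []).append((key, value))

    def fetch(self, layer):
        return self._store.get(layer, [])


class PagedKVCache(KVCacheInterface):
    """A different implementation of the same interface (paging details elided)."""

    def __init__(self, page_size: int = 16):
        self.page_size = page_size
        self._pages = {}

    def append(self, layer, key, value):
        self._pages.setdefault(layer, []).append((key, value))

    def fetch(self, layer):
        return self._pages.get(layer, [])


def make_kv_cache(kind: str) -> KVCacheInterface:
    # The engine only depends on KVCacheInterface, so replacing the cache is
    # just a matter of constructing a different instance here.
    return PagedKVCache() if kind == "paged" else DenseKVCache()
```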

This is something we would ideally like to enable, and indeed we need to overcome some of the hurdles mentioned. We can keep this issue open to track the status,...

Thanks for reporting. As a temporary measure, reducing the prefill chunk size might help. We should follow up by automatically limiting this number when we run gen config.
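A sketch of the temporary workaround, assuming the `prefill_chunk_size` field in an existing mlc-chat-config.json; the helper name, the example clamp value of 1024, and the config path are assumptions for illustration.

```python
# Hedged sketch: shrink prefill_chunk_size in mlc-chat-config.json as a
# temporary workaround (only lowers the value, never raises it).
import json
from pathlib import Path


def clamp_prefill_chunk_size(config_path: str, max_chunk: int = 1024) -> None:
    path = Path(config_path)
    config = json.loads(path.read_text())
    current = config.get("prefill_chunk_size", max_chunk)
    config["prefill_chunk_size"] = min(current, max_chunk)
    path.write_text(json.dumps(config, indent=2))


# Example (hypothetical path):
# clamp_prefill_chunk_size("dist/my-model-MLC/mlc-chat-config.json")
```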

@ahz-r3v you might need to cross-check whether you have recompiled the lib.

Closing, as the delivery flow has now landed.

Added https://github.com/mlc-ai/mlc-llm/pull/2445