Feature Request: Provide progress callback for loading model from cache
Currently, initProgressCallback only reports progress when the model is being downloaded from the network. However, when the model is already cached in the browser (via CacheStorage), the user receives no feedback, even though loading and initializing the model from cache can still take multiple seconds.
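For reference, a minimal sketch of how the existing callback is typically wired up (the model ID is just an example):

```ts
import { CreateMLCEngine, InitProgressReport } from "@mlc-ai/web-llm";

// Sketch of current usage (example model ID). On a cold start report.progress
// advances as shards download; on a warm start (CacheStorage hit) it stays at 0
// even though reading and initializing the cached shards can take several seconds.
const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
  initProgressCallback: (report: InitProgressReport) => {
    console.log(report.text, `${Math.round(report.progress * 100)}%`);
  },
});
```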
I’d like to request a new callback, e.g., cacheLoadProgressCallback, that reports progress as each cached shard is read and processed.
This would allow developers to show meaningful progress indicators during both cold and warm starts.
Possible implementation ideas:
- Trigger the callback as each shard is read from CacheStorage and passed into arrayBuffer() (see the sketch after this list)
- Report total bytes loaded from cache vs total expected
- Or simply provide a boolean flag indicating whether loading is from cache or network
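A rough sketch of the first two ideas, assuming the shard URLs and total byte count are known up front (e.g. from the model's cache manifest). None of these names exist in WebLLM or TVM today; they are purely illustrative:

```ts
// Hypothetical: shape of the progress report the new callback could receive.
interface CacheLoadProgress {
  loadedBytes: number;
  totalBytes: number;
  progress: number; // 0..1
}

// Hypothetical helper: read each cached shard and report progress after every one,
// so warm starts get the same kind of feedback cold starts already have.
async function loadShardsFromCache(
  cache: Cache,
  shardUrls: string[],
  totalBytes: number,
  cacheLoadProgressCallback: (p: CacheLoadProgress) => void,
): Promise<ArrayBuffer[]> {
  const buffers: ArrayBuffer[] = [];
  let loadedBytes = 0;
  for (const url of shardUrls) {
    const response = await cache.match(url);
    if (!response) {
      throw new Error(`Shard missing from CacheStorage: ${url}`);
    }
    const buffer = await response.arrayBuffer();
    loadedBytes += buffer.byteLength;
    buffers.push(buffer);
    cacheLoadProgressCallback({
      loadedBytes,
      totalBytes,
      progress: totalBytes > 0 ? loadedBytes / totalBytes : 0,
    });
  }
  return buffers;
}
```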
Thanks for the awesome work on WebLLM!
I am pretty sure this did work in the past.
Thanks for the issue! If I understand your request correctly, this should indeed work; e.g., on chat.webllm.ai you can see the status message "loading model from cache".
Yes, but the problem is that it's always 0%. You could argue that this makes sense, since 0 bytes are loaded from the network, but from a user's perspective it doesn't matter whether the model is loaded from the network or from the cache; they just want to know the progress.
Ah, you're right. This is likely a bug in https://github.com/apache/tvm/blob/8a914e58925557741aca6d7453e5d94004254079/web/src/runtime.ts#L1316
Not ideal, @kentcdodds, but as a workaround you can parse the [xx/yy] counter out of the progress text. See the _findPercentCompleteFromStatus() function at the top of https://github.com/DecentAppsNet/decentapp-template/blob/main/src/loadScreen/interactions/initialization.ts
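Something along these lines (not the exact code from the template, and the progress text format may vary between WebLLM versions):

```ts
// Parse the "[xx/yy]" shard counter out of the initProgressCallback text and
// turn it into a percentage. Returns null if the text doesn't contain a counter.
function findPercentCompleteFromStatus(statusText: string): number | null {
  const match = statusText.match(/\[(\d+)\/(\d+)\]/);
  if (!match) return null;
  const current = Number(match[1]);
  const total = Number(match[2]);
  if (total === 0) return null;
  return Math.round((current / total) * 100);
}
```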
Created a PR to fix this in apache/tvm; in the meantime, a workaround is to use what @erikh2000 posted above.