Charlie Ruan
### Overview

This PR supports warp-level shuffle primitives using the newly introduced `subgroup` in WebGPU. We then use them in the implementation of allreduce lowering. The introduced primitives are:

-...
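To illustrate how shuffle primitives can drive an allreduce, here is a minimal CPU-side sketch in Python. It is not the PR's implementation: the names `shuffle_xor` and `subgroup_allreduce` are hypothetical, the list of primitives above is truncated so a shuffle-xor style exchange is assumed, and the subgroup size is assumed to be a power of two.

```python
from typing import Callable, List


def shuffle_xor(lanes: List[int], mask: int) -> List[int]:
    """Simulate a shuffle-xor: lane i reads the value held by lane i ^ mask.

    Hypothetical helper; it models the data movement only, not WebGPU itself.
    """
    return [lanes[i ^ mask] for i in range(len(lanes))]


def subgroup_allreduce(
    lanes: List[int],
    op: Callable[[int, int], int] = lambda a, b: a + b,
) -> List[int]:
    """Butterfly allreduce: after log2(n) exchange steps every lane holds
    the reduction of all lanes.

    Assumes len(lanes) is a power of two, as a subgroup size typically is.
    """
    n = len(lanes)
    if n == 0 or n & (n - 1):
        raise ValueError("lane count must be a non-zero power of two")

    values = list(lanes)
    offset = n // 2
    while offset > 0:
        # Each lane combines its own value with its partner's value.
        partner = shuffle_xor(values, offset)
        values = [op(a, b) for a, b in zip(values, partner)]
        offset //= 2
    return values


if __name__ == "__main__":
    print(subgroup_allreduce([1, 2, 3, 4]))
```

The butterfly pattern needs no shared memory or barriers between steps, which is the usual motivation for lowering allreduce to shuffles when the reduction fits within one subgroup.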
Roadmap
### Function calling

- [ ] Integrate with XGrammar's structural tag: https://github.com/mlc-ai/xgrammar/pull/162, and enable reliable tool use with small models in WebLLM
- [ ] Add an E2E MCP-like example,...
This PR updates the web-xgrammar TypeScript package, mainly adding support for Structural Tag. There are two components to this PR:

1. Address trivial changes to the APIs, including adding `any_whitespace`...