web-llm
[embeddings] Any plans for adding other transformer models like sentence-transformers?
It would be really nice to have WebGPU support for running other transformer models, such as SBERT and other embedding models. For an example of this, see transformer.js.
Thanks!
@jinhongyii
Thank you for the suggestion. We are happy to see more and more models supported in web-llm. There are already open PRs adding ChatGLM and Dolly model support. If you are interested in bringing embedding models in, you can use those PRs as a reference and open a new PR.
Thank you! Are there any specific modifications needed for embedding models? For example, does the TensorIR technique still apply to embedding models?
Yes, of course, embedding models can be represented in TensorIR. Basically, what you need to do is translate the model (its PyTorch implementation) into the corresponding Relax operators. If an operation has no direct translation, write the TensorIR manually, or write TE expressions that can be converted to TensorIR.
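To make the translation step a bit more concrete, here is a minimal sketch of the part an embedding model typically adds on top of the transformer: masked mean pooling followed by L2 normalization. It is plain Python written as explicit loops, which is roughly the shape such a computation would take if you had to express it by hand as TE or TensorIR. The function names (mean_pool, l2_normalize) and the toy inputs are hypothetical, and the choice of mean pooling is an assumption about the model; this is not web-llm or TVM code.

```python
import math

# Hypothetical helpers for illustration only; not part of web-llm or TVM.

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    hidden = len(token_embeddings[0])
    pooled = [0.0] * hidden
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for j in range(hidden):
                pooled[j] += vec[j]
    count = max(count, 1)  # avoid division by zero on an all-padding input
    return [x / count for x in pooled]

def l2_normalize(vec, eps=1e-12):
    """Scale the vector to unit length so dot products act as cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]

# Toy input: three tokens with hidden size 2; the last token is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
embedding = l2_normalize(mean_pool(tokens, mask))
print(embedding)
```

Each loop nest here corresponds to one reduction or elementwise step, so if the model you are porting pools this way, these are the pieces to check for an existing Relax operator before writing anything by hand.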