
[SLM] Support BERT architecture. Implement a text embedding module

Open rickzx opened this issue 9 months ago • 1 comments

This PR adds text embedding support to MLC-LLM using a BERT encoder-only model.

Example usage: https://github.com/rickzx/mlc-llm/blob/18aa7ee378b826a61ce4baa98e4bab1bf3d64038/python/mlc_llm/embeddings/embeddings.ipynb
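For readers who have not worked with encoder-only models, here is a minimal sketch of how per-token encoder outputs are commonly reduced to a single text embedding. It is not the code in this PR: masked mean pooling followed by L2 normalization is an assumption (the PR may pool differently, e.g. via the `[CLS]` token), and the helper names `mean_pool`, `l2_normalize`, and `cosine_similarity` are hypothetical, as is the toy `hidden` data.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the vectors of non-padding tokens (mask value 1)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            for i, x in enumerate(vec):
                total[i] += x
            count += 1
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [x / count for x in total]


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of unit vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Hypothetical encoder output: 3 token positions, hidden size 2.
# The last position is padding and is excluded by the mask.
hidden = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
mask = [1, 1, 0]
embedding = l2_normalize(mean_pool(hidden, mask))
```

Normalizing the pooled vector means two embeddings can be compared with a plain dot product, which is what most retrieval setups expect from an embedding endpoint.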

rickzx avatar Apr 29 '24 20:04 rickzx

This is a good first step towards embedding support through a Python-level API. It would be great to also think about what it would take to bring this into the ThreadEngine. In that case we would need to support multiple models, but we would also have the opportunity to expose it as a universal embedding endpoint.

tqchen avatar Apr 30 '24 00:04 tqchen

Please fix the Jenkins failure here.

tqchen avatar May 07 '24 21:05 tqchen

> please fix the jenkins here

This should be addressed by https://github.com/mlc-ai/mlc-llm/pull/2292. I'm triggering a rebuild now.

rickzx avatar May 07 '24 21:05 rickzx

To fix the CUDA error, see https://github.com/apache/tvm/pull/16982.

rickzx avatar May 08 '24 21:05 rickzx