FlagEmbedding
Onnx Support for BGE-M3
Hey there, thank you for sharing the model! I was curious: are there any plans to support ONNX for this model?
We currently do not plan to release an official ONNX version. You can refer to some online documentation to convert the model to ONNX. Additionally, we highly welcome contributions from the community.
Hi, I just published ONNX version with scripts to do the ONNX conversion here: https://huggingface.co/aapot/bge-m3-onnx
Many thanks @aapot for sharing this!!
Thanks for your work. It seems to be a CPU version, right?
Yep, it's converted for CPU, but you can also export a GPU version using the provided script and its device argument: https://huggingface.co/aapot/bge-m3-onnx#export-onnx-weights
How do I set the GPU id in the inference code? This is what I have:
```python
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-m3")
ort_session = ort.InferenceSession("model.onnx")

inputs = tokenizer(
    "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.",
    padding="longest",
    return_tensors="np",
)
inputs_onnx = {k: ort.OrtValue.ortvalue_from_numpy(v) for k, v in inputs.items()}

outputs = ort_session.run(None, inputs_onnx)
```
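Not an official answer, but in onnxruntime you can usually pin a session to a particular GPU by passing the CUDA execution provider with its `device_id` option when creating the session (this requires the `onnxruntime-gpu` build; GPU index 1 below is just an example):

```python
# Hypothetical sketch: select the GPU by index via the CUDAExecutionProvider
# options; CPUExecutionProvider acts as a fallback if CUDA is unavailable.
providers = [
    ("CUDAExecutionProvider", {"device_id": 1}),  # run on GPU 1
    "CPUExecutionProvider",
]

# Then create the session with this provider list, e.g.:
# import onnxruntime as ort
# ort_session = ort.InferenceSession("model.onnx", providers=providers)
```

Alternatively, setting the `CUDA_VISIBLE_DEVICES` environment variable before launching the process restricts which GPUs onnxruntime can see at all.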