
ONNX Support for BGE-M3

Open vinchg opened this issue 1 year ago • 6 comments

Hey there, thank you for sharing the model. I was curious, are there any plans to support ONNX for this model?

vinchg avatar Jan 31 '24 21:01 vinchg

We currently do not plan to release an official ONNX version. You can refer to online documentation on converting models to ONNX. Additionally, we welcome contributions from the community.

staoxiao avatar Feb 01 '24 08:02 staoxiao

Hi, I just published an ONNX version, with scripts to do the ONNX conversion, here: https://huggingface.co/aapot/bge-m3-onnx

aapot avatar Feb 16 '24 12:02 aapot

> Hi, I just published ONNX version with scripts to do the ONNX conversion here: https://huggingface.co/aapot/bge-m3-onnx

Many thanks @aapot for sharing this!!

SpirosMakris avatar Feb 16 '24 14:02 SpirosMakris

> Hi, I just published ONNX version with scripts to do the ONNX conversion here: https://huggingface.co/aapot/bge-m3-onnx

Thanks for your work. It seems to be a CPU version, right?

shiningliang avatar Feb 18 '24 09:02 shiningliang

> Hi, I just published ONNX version with scripts to do the ONNX conversion here: https://huggingface.co/aapot/bge-m3-onnx
>
> Thanks for your work. It seems to be a CPU version, right?

Yep, it's converted for CPU, but you can also convert it for GPU using the provided script and its device argument: https://huggingface.co/aapot/bge-m3-onnx#export-onnx-weights

aapot avatar Feb 19 '24 07:02 aapot

> Hi, I just published ONNX version with scripts to do the ONNX conversion here: https://huggingface.co/aapot/bge-m3-onnx
>
> Thanks for your work. It seems to be a CPU version, right?
>
> Yep it's converted for CPU but you can also convert to ONNX GPU using the provided script and the device argument: https://huggingface.co/aapot/bge-m3-onnx#export-onnx-weights

How do you set the GPU id in the inference code?

```python
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-m3")
ort_session = ort.InferenceSession("model.onnx")

inputs = tokenizer(
    "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.",
    padding="longest",
    return_tensors="np",
)
inputs_onnx = {k: ort.OrtValue.ortvalue_from_numpy(v) for k, v in inputs.items()}

outputs = ort_session.run(None, inputs_onnx)
```

ZTurboX avatar Jul 12 '24 02:07 ZTurboX
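
For readers who land here with the same question: ONNX Runtime selects the device through the `providers` argument of `InferenceSession`, and the `CUDAExecutionProvider` accepts a `device_id` option to pin a specific GPU. A minimal sketch, assuming `onnxruntime-gpu` is installed (`build_providers` is just an illustrative helper, not part of the thread's code):

```python
# Sketch: build a providers list that pins a specific GPU via the
# CUDAExecutionProvider's device_id option, falling back to CPU.
def build_providers(gpu_id: int):
    return [
        ("CUDAExecutionProvider", {"device_id": gpu_id}),  # run on this GPU if available
        "CPUExecutionProvider",  # fallback when CUDA is unavailable
    ]

# Usage (requires onnxruntime-gpu and a converted model.onnx):
# import onnxruntime as ort
# ort_session = ort.InferenceSession("model.onnx", providers=build_providers(1))
```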