
[Model Request]: Support for BGE-M3 embedding model

Open JPC612 opened this issue 1 year ago • 5 comments

What happened?

I would really appreciate it if the BGE-M3 model were supported, including dense, sparse, and ColBERT vectors.
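For reference, this is roughly what the three output types look like when produced with the upstream FlagEmbedding package (not fastembed). This is only an illustrative sketch and assumes FlagEmbedding's documented BGEM3FlagModel.encode() API; please verify the parameter names against its current docs:

# Illustrative sketch (assumption: FlagEmbedding is installed; this is NOT fastembed's API).
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)
sentences = ["What is BGE-M3?", "BGE-M3 is a multilingual embedding model."]

output = model.encode(
    sentences,
    return_dense=True,         # 1024-d dense vectors
    return_sparse=True,        # lexical (sparse) term weights
    return_colbert_vecs=True,  # per-token ColBERT multi-vectors
)

print(output["dense_vecs"].shape)       # dense embeddings
print(output["lexical_weights"][0])     # sparse weights for the first sentence
print(output["colbert_vecs"][0].shape)  # ColBERT token vectors for the first sentence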

What Python version are you on? e.g. python --version

Python 3.11

Version

0.2.7 (Latest)

What os are you seeing the problem on?

Linux

Relevant stack traces and/or logs

No response

JPC612 avatar Sep 23 '24 14:09 JPC612

+1 from us - we think that would greatly improve the quality and performance of the embeddings, especially in multilingual applications.

BobMiles avatar Oct 16 '24 07:10 BobMiles

BGE-M3 requires an extension of our interface; we are going to start working on this feature soon.

joein avatar Oct 16 '24 09:10 joein

Any follow-ups?

rong-xyz avatar May 21 '25 04:05 rong-xyz

@JPC612 Hi! I've implemented a solution to support the BGE-M3 model locally. I created two patches:

  1. A patch for the ONNX embedding model registration to support BGE-M3
  2. A patch for the model management system to handle local model paths

Here's my implementation that allows using the local BGE-M3 model:


"""
FastEmbed Patch for Local Model Support

This module provides patches for FastEmbed to support local ONNX models, specifically BGE-M3.
It includes patches for both model registration and model loading from local paths.

Author: Jiaohuix
Date: 2025/05/21
"""

import os
import functools
from pathlib import Path
from typing import Any, TypeVar, Type, Optional, List

from fastembed import TextEmbedding
from fastembed.common.model_description import DenseModelDescription, ModelSource
from fastembed.text.onnx_embedding import OnnxTextEmbedding, supported_onnx_models
from fastembed.text.text_embedding_base import TextEmbeddingBase
from fastembed.common.model_management import ModelManagement

# Constants
LOCAL_MODEL_PATH = "/home/jiaohuix/Projects/pretrained_models/bge-m3"
MODEL_ID = "BAAI/bge-m3"

# Type definitions
T = TypeVar("T")


def create_bge_m3_model_description() -> DenseModelDescription:
    """
    Create a model description for BGE-M3.

    Returns:
        DenseModelDescription: The model description object.
    """
    return DenseModelDescription(
        model="BAAI/bge-m3",
        dim=1024,
        description=(
            "Text embeddings, Unimodal (text), Chinese, 512 input tokens truncation, "
            "Prefixes for queries/documents: not so necessary, 2024 year."
        ),
        license="apache-2.0",
        size_in_GB=1.2,
        additional_files=[],
        sources=ModelSource(
            hf=None,
            url=LOCAL_MODEL_PATH,
            _deprecated_tar_struct=False
        ),
        model_file="onnx/model.onnx",
    )


def patch_download_model(cls: Type[ModelManagement]) -> None:
    """
    Patch the ModelManagement class to support local model paths.

    Args:
        cls (Type[ModelManagement]): The ModelManagement class to patch.
    """
    original_download_model = cls.download_model

    @classmethod
    @functools.wraps(original_download_model)
    def patched_download_model(cls, model: T, cache_dir: str, retries: int = 3, **kwargs: Any) -> Path:
        """
        Modified download_model method that supports local model paths.

        Args:
            model (T): The model description.
            cache_dir (str): The path to the cache directory.
            retries (int): The number of times to retry (including the first attempt).
            **kwargs: Additional keyword arguments.

        Returns:
            Path: The path to the downloaded model directory.
        """
        local_files_only = kwargs.get("local_files_only", False)  # kept from the original signature; not used by this patch
        specific_model_path: Optional[str] = kwargs.pop("specific_model_path", None)
        if specific_model_path:
            return Path(specific_model_path)

        # Check if url_source is a local path
        url_source = model.sources.url
        if url_source and os.path.exists(url_source):
            print(f"Model is already exists locally: {url_source}")
            return Path(url_source)

        # Call original method
        return original_download_model(model, cache_dir, retries, **kwargs)

    # Replace original method
    cls.download_model = patched_download_model


def register_bge_m3_model() -> None:
    """
    Register the BGE-M3 model to the supported models list.
    """
    bge_m3_model = create_bge_m3_model_description()
    supported_onnx_models.insert(0, bge_m3_model)


def apply_patches() -> None:
    """
    Apply all necessary patches to support local BGE-M3 model.
    """
    # Apply ModelManagement patch
    patch_download_model(ModelManagement)
    # Register BGE-M3 model
    register_bge_m3_model()


def create_test_documents() -> List[str]:
    """
    Create a list of test documents.

    Returns:
        List[str]: A list of test documents.
    """
    return [
        # "This is a test document used to verify that the model works correctly."
        "这是一个测试文档,用于验证模型是否正常工作。",
        # "This is the second test document, used to verify batch processing."
        "这是第二个测试文档,用于验证批处理功能。",
    ]


def test_embedding() -> None:
    """
    Test the embedding functionality with the BGE-M3 model.
    """
    documents = create_test_documents()

    # Initialize and test the model
    print("Supported models:", OnnxTextEmbedding._list_supported_models())
    embedding_model = TextEmbedding(model_name=MODEL_ID)
    print(f"The model {MODEL_ID} is ready to use.")

    # Generate embeddings
    embeddings_generator = embedding_model.embed(documents)
    print(embeddings_generator)

    # Convert to list and print results
    embeddings_list = list(embedding_model.embed(documents))
    print(f"Embedding dimension: {len(embeddings_list[0])}")
    print(f"First 10 values of first embedding: {embeddings_list[0][:10]}")


if __name__ == "__main__":
    # Apply all patches
    apply_patches()
    # Run the test
    test_embedding()
    '''
    The model BAAI/bge-m3 is ready to use.
    <generator object TextEmbedding.embed at 0x7da0295be5f0>
    Embedding dimension: 1024
    First 10 values of first embedding: [-0.05687739 -0.01121791 -0.06061672 -0.02679417 -0.00059405 -0.02982275
    0.04113888  0.03449325  0.01062364  0.01557624]
    '''

The solution works by the following (a condensed usage sketch follows the list):

  1. Adding BGE-M3 to the supported ONNX models list
  2. Modifying the model download logic to handle local paths
  3. Using the local model path instead of downloading from HuggingFace
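Condensed, using the patch from another script looks roughly like this. This is a minimal sketch; fastembed_local_patch.py is a hypothetical filename for the module above:

# Minimal usage sketch (assumption: the patch module above is saved as fastembed_local_patch.py).
from fastembed import TextEmbedding
from fastembed_local_patch import apply_patches  # hypothetical module name

apply_patches()  # register BGE-M3 and enable local-path loading

model = TextEmbedding(model_name="BAAI/bge-m3")
vectors = list(model.embed(["hello world", "bonjour le monde"]))
print(len(vectors[0]))  # expected: 1024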

Note: Before using this code, you need to:

  1. Download the BGE-M3 model from HuggingFace (see the download sketch after this list)
  2. Place it in your local directory (e.g., /home/jiaohuix/Projects/pretrained_models/bge-m3)
  3. Make sure the model files are in the correct structure (with onnx/model.onnx)
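For step 1, a minimal download sketch using huggingface_hub (an assumption on my part: huggingface_hub is installed and the target directory matches LOCAL_MODEL_PATH from the patch; adjust the path for your machine). If the BAAI/bge-m3 repository does not ship an ONNX export under onnx/, you will need to export the model to ONNX yourself before using the patch:

# Download sketch (assumptions: huggingface_hub is installed; target path matches LOCAL_MODEL_PATH).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="BAAI/bge-m3",
    local_dir="/home/jiaohuix/Projects/pretrained_models/bge-m3",
)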

I've tested it and it works well with the local BGE-M3 model. The embeddings are generated correctly with 1024 dimensions.

Would this be helpful for the official implementation? Let me know if you need any clarification or have questions about the implementation.

jiaohuix avatar May 21 '25 08:05 jiaohuix

BGE-M3 requires an extension of our interface; we are going to start working on this feature soon.

@joein Are there still plans to extend the interface to support BGE-M3?

ReadyPlayerEmma avatar Jun 04 '25 16:06 ReadyPlayerEmma

Are there any updates on this thread? I am really interested in testing this library, but I need the BGE-M3 model for my purposes.

cartantino avatar Dec 11 '25 14:12 cartantino