FlagEmbedding

Retrieval and Retrieval-augmented LLMs

622 issues, sorted by recently updated

What's the difference between [encode_corpus](https://github.com/FlagOpen/FlagEmbedding/blob/4efa19d7eb6661494280df556cb8e92c9363c4f1/FlagEmbedding/inference/embedder/decoder_only/base.py#L132C9-L132C14) and [encode](https://github.com/FlagOpen/FlagEmbedding/blob/4efa19d7eb6661494280df556cb8e92c9363c4f1/FlagEmbedding/inference/embedder/decoder_only/base.py#L160)?
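
For context, a minimal sketch of how the three entry points are typically called, using `FlagLLMModel` (the decoder-only embedder the linked file backs). The model name and the instruction-argument names are assumptions based on the library's usual pattern, not taken from the issue:

```python
from FlagEmbedding import FlagLLMModel  # decoder-only embedder wrapper

# Model id and instruction text are illustrative assumptions.
model = FlagLLMModel(
    "BAAI/bge-multilingual-gemma2",
    query_instruction_for_retrieval="Given a web search query, retrieve relevant passages that answer the query.",
    use_fp16=True,
)

queries = ["What is BGE-M3?"]
corpus = ["BGE-M3 is a multilingual embedding model supporting dense, sparse, and multi-vector retrieval."]

# encode_queries prepends the retrieval instruction to each query;
# encode_corpus embeds passages without it; encode is the generic entry point.
q_emb = model.encode_queries(queries)
d_emb = model.encode_corpus(corpus)
print(q_emb @ d_emb.T)  # cosine-style similarity scores
```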

A few questions: 1. Encoder-type models output the similarity score via the CLS token; do decoder-type models also output a similarity score, and how is it produced? 2. What does "lightweight" mean here? 3. Is there inference-speed data for each model? If the output is just a single score, a 2B model shouldn't be slow either, right?
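
On question 1, a sketch of how a decoder-based reranker is typically called in this library: the usual design scores a (query, passage) pair by reading the logit of a "yes"-style token at the final position rather than pooling a CLS vector. The model id is illustrative, and whether this matches every checkpoint is an assumption:

```python
from FlagEmbedding import FlagLLMReranker

# Decoder-only reranker; the score comes from the LM head's logit for a
# "yes"-style answer token, not from a CLS embedding.
reranker = FlagLLMReranker("BAAI/bge-reranker-v2-gemma", use_fp16=True)
score = reranker.compute_score(
    ["What is BGE-M3?", "BGE-M3 is a multilingual embedding model."]
)
print(score)
```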

When using FlagEmbedding for vectorization with the bge-m3 model, is there a best practice to follow for high concurrency? I wrapped FlagEmbedding into a vectorization service myself, but under multithreading I run into strange problems. Here is my code:

```python
import os
import traceback
from concurrent.futures import ThreadPoolExecutor
from fastapi import FastAPI, HTTPException
from fastapi.responses import JSONResponse
from fastapi.encoders import jsonable_encoder
import uvicorn
import asyncio
from log.log_info...
```
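
`BGEM3FlagModel` is not documented as thread-safe, so a common workaround is to funnel all encode calls through a single worker thread. A minimal sketch, assuming the standard `BGEM3FlagModel` API; the batch size is an arbitrary choice:

```python
from concurrent.futures import ThreadPoolExecutor
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

# A single-worker executor serializes access to the model, so concurrent
# FastAPI request handlers never call encode() from two threads at once.
encoder = ThreadPoolExecutor(max_workers=1)

def embed(texts: list[str]):
    return encoder.submit(
        lambda: model.encode(texts, batch_size=12)["dense_vecs"]
    ).result()

print(embed(["hello world"]).shape)
```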

```
E1119 08:26:02.715000 28705 site-packages/torch/distributed/elastic/multiprocessing/api.py:882] failed (exitcode: -11) local_rank: 0 (pid: 28773) of binary: /app/anaconda3/envs/py312/bin/python3.12
Traceback (most recent call last):
  File "/app/anaconda3/envs/py312/bin/torchrun", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "/app/anaconda3/envs/py312/lib/python3.12/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line...
```

Hi, I'm very interested in the Activation Beacon work and tried to run the model using the example on HF, but the output is garbled. The example used:

```python
messages = [{"role": "user", "content": "Tell me about yourself."}]
```

Output:

```
Input Length: 24
Output: 'system\nYou are a helpful assistant.\nuser\nTell me about yourself.\nassistant\n - 10.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000'
```

The transformers version I'm using is 4.45.0, and the flash-attn version is 2.7.2.
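
In case it helps reproduce, this is roughly how a chat-format example is usually run with transformers. The model id is a placeholder, and whether the Activation Beacon checkpoint needs extra generation arguments is not shown here; note that the echoed template in the output above is the kind of symptom you see when the generation prompt is missing:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model id; substitute the Activation Beacon checkpoint from HF.
model_id = "your-org/activation-beacon-model"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Tell me about yourself."}]
# add_generation_prompt=True appends the assistant prefix; without it the
# model may re-emit the whole template instead of answering.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```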

Could there be an issue with the parameter settings in my training script?

```
export WANDB_MODE=disabled
train_data="\
    /home/jovyan/dataws1/bgeft/train_table_data "
# set large epochs and small batch size for testing
num_train_epochs=1
per_device_train_batch_size=1
...
```

When installing FlagEmbedding 1.3.2, it pulls in peft 0.18, which is incompatible with the library and breaks the model for inference. Can you pin the dependencies so...

Hi, I used `uv add "visual_bge @ git+https://github.com/FlagOpen/FlagEmbedding.git#subdirectory=research/visual_bge"` to install `visual_bge`, but I cannot import visual_bge. I see: (error screenshot). Should `__init__.py` be under visual_bge/visual_bge?

Switched `tokenizer.pad` to the `__call__` method for better performance and error avoidance
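
For context, a minimal sketch of the two tokenization paths in transformers that this PR title contrasts; the model id is an illustrative assumption:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-m3")
texts = ["hello world", "a longer second sentence"]

# __call__ tokenizes, truncates, and pads in one fast-tokenizer pass:
batch = tokenizer(
    texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
)

# tokenizer.pad only pads already-encoded features, so it requires a
# separate encode step first and can error on malformed feature dicts:
features = [tokenizer(t, truncation=True, max_length=512) for t in texts]
batch2 = tokenizer.pad(features, padding=True, return_tensors="pt")
```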

When fine-tuning bge-m3, grad_norm drops to 0 after a few epochs. Is this normal?
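
A minimal PyTorch sketch for checking whether the gradients are genuinely zero (for example from fp16 underflow or a learning-rate schedule reaching zero), assuming you can hook into the training loop after `backward()`:

```python
import torch

def global_grad_norm(model: torch.nn.Module) -> float:
    # L2 norm over all parameter gradients, computed after backward() and
    # before optimizer.step(); roughly what trainers log as grad_norm.
    total = 0.0
    for p in model.parameters():
        if p.grad is not None:
            total += p.grad.detach().float().norm(2).item() ** 2
    return total ** 0.5
```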