iamreallyi9 issues

Results: 4 issues of iamreallyi9

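Question about a shift in the similarity distribution

For the same texts, using cosine similarity, the score was around 0.8 before fine-tuning and around 0.5 after, which is a clear shift. If the model is only used for retrieval by taking the top-k, the returned content changes little, so the impact is small. What could be causing this? My own guesses: (1) the choice of neg samples, since some of them may be false negatives; (2) too many epochs?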
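To illustrate the observation that absolute scores can shift while top-k retrieval stays the same, here is a minimal NumPy sketch. The random vectors and the rescaling `0.6 * s - 0.2` are hypothetical stand-ins chosen for illustration; they are not what fine-tuning actually does to the embeddings, and they say nothing about which of the two suspected causes applies.

```python
import numpy as np

def cosine_similarity(query, passages):
    """Cosine similarity between one query vector and each row of passages."""
    q = query / np.linalg.norm(query)
    p = passages / np.linalg.norm(passages, axis=1, keepdims=True)
    return p @ q

def top_k(scores, k):
    """Indices of the k highest scores, best first."""
    return np.argsort(-scores, kind="stable")[:k]

rng = np.random.default_rng(0)
query = rng.normal(size=8)            # hypothetical query embedding
passages = rng.normal(size=(20, 8))   # hypothetical passage embeddings

scores_before = cosine_similarity(query, passages)

# Assumption: model the shift as a strictly increasing rescaling of every score.
scores_after = 0.6 * scores_before - 0.2

# Absolute values differ, but the ranking (and so the top-k) is unchanged.
same_top5 = np.array_equal(top_k(scores_before, 5), top_k(scores_after, 5))
print("top-5 unchanged:", same_top5)
```

Any strictly increasing change to the scores leaves the ranking intact, which is consistent with retrieval results changing little. A fixed similarity threshold, however, would need to be re-tuned after such a shift.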

How to make the example jpg+obj+npz?

How do I make the example jpg+obj+npz? Should I use face3d?

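Question about the tokenizer

Hello, I'd like to ask: how should I train a bge model with my own tokenizer (the chatglm tokenizer)? My current idea is to adapt the chatglm tokenizer into something like a bert-tokenizer. Is that feasible?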

![image](https://github.com/user-attachments/assets/db78465d-7f49-4ef6-b830-189b3f06283c)

The parameters are as follows: --learning_rate 3e-5 \ --fp16 \ --num_train_epochs 2 \ --per_device_train_batch_size 4 \ --dataloader_drop_last True \ --normlized False \ --temperature 0.02 \ --query_max_len 512 \ --passage_max_len 512 \ --train_group_size 6...

grad_norm is extremely large. Is training like this normal?
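For context on what a grad_norm value measures, here is a minimal sketch. It assumes grad_norm is the global L2 norm taken over all parameter gradients together, which is a common convention in training loops. The issue does not state how the logged value is computed, and the gradient arrays below are made up for illustration.

```python
import numpy as np

def global_grad_norm(grads):
    """L2 norm over all gradient arrays, treated as one flat vector."""
    total_sq = sum(float(np.sum(np.square(g))) for g in grads)
    return float(np.sqrt(total_sq))

# Hypothetical gradients for two parameter tensors.
example_grads = [np.array([3.0, 0.0]), np.array([[4.0]])]
norm = global_grad_norm(example_grads)
print(norm)
```

Under this convention the value grows with both the number of parameters and the scale of each gradient, so it is best compared across steps of the same run, not against a fixed absolute number.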