blue-vision0 issues

Results: 3 issues by blue-vision0

Training shows "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained."; inference fails because the tokenizer cannot be found

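During training I already get the message "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained". ![image](https://github.com/FlagOpen/FlagEmbedding/assets/89055561/8718e027-c1fa-403a-ba71-d41b1f86ea68) At inference time, calling From_pretrained_tokenizer raises os.error because a file cannot be found. After investigating, I found that the tokenizer_config.json saved with the fine-tuned model differs from the base model's tokenizer_config.json. ![image](https://github.com/FlagOpen/FlagEmbedding/assets/89055561/697fdf0a-8a13-4a1d-ab70-ed49e7aeb9d5) This seems to indicate that the saved model depends on the original base model's tokenizer_config.json. Because I train and deploy on different machines, that file could not be found on the deployment machine and the error was raised. The model only worked once I copied the base model's files to the corresponding location. This looks like a bug: when I trained with an earlier version of the bge scripts I did not have this problem, and the tokenizer_config file did not change. The library versions I am using are: sentence-transformers 2.2.2 transformers 4.34.0...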
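To narrow down a mismatch like the one described above, it can help to diff the two tokenizer_config.json files and flag any entry that points at a path missing on the current machine. The following is a minimal sketch using only the standard library; the helper names (`load_config`, `diff_configs`, `dangling_paths`) are hypothetical and not part of FlagEmbedding or transformers, and it assumes the problematic entry is a top-level string holding an absolute path.

```python
import json
import os
from pathlib import Path


def load_config(model_dir):
    """Read tokenizer_config.json from a model directory."""
    with open(Path(model_dir) / "tokenizer_config.json", encoding="utf-8") as f:
        return json.load(f)


def diff_configs(base_cfg, tuned_cfg):
    """Return {key: (base_value, tuned_value)} for every key that differs.

    A key that is absent on one side is reported with None on that side.
    """
    missing = object()  # sentinel, so an absent key differs from any real value
    diff = {}
    for key in sorted(set(base_cfg) | set(tuned_cfg)):
        a = base_cfg.get(key, missing)
        b = tuned_cfg.get(key, missing)
        if a != b:
            diff[key] = (None if a is missing else a, None if b is missing else b)
    return diff


def dangling_paths(cfg):
    """Return top-level entries whose value is an absolute path that does not exist here."""
    return {
        key: value
        for key, value in cfg.items()
        if isinstance(value, str) and os.path.isabs(value) and not os.path.exists(value)
    }
```

Running `diff_configs(load_config(base_dir), load_config(tuned_dir))` on the training machine and `dangling_paths(load_config(tuned_dir))` on the deployment machine shows which entries were added during fine-tuning and which of them reference files that were not shipped with the model.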

A question about CLS and MEAN_POOLING

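When an embedding model is used for vector retrieval, why does everyone default to CLS as the final output rather than MEAN_POOLING, FIRST_LAST_AVG, or other options? Is there data showing that CLS is optimal in most scenarios? How does the author see this question?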
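For reference, the pooling options named above differ only in how per-token hidden states are reduced to a single vector. The sketch below spells this out in plain Python on nested lists (one list of floats per token); the function names are illustrative, and FIRST_LAST_AVG is assumed to mean "average the first and last layer per token, then mean-pool", which is how the term is commonly used. It says nothing about which strategy retrieves better.

```python
def cls_pooling(hidden, mask):
    """Use the first token's vector ([CLS]) as the sentence embedding.

    mask is unused; it is accepted so all strategies share one signature.
    """
    return list(hidden[0])


def mean_pooling(hidden, mask):
    """Average the token vectors, skipping padding positions (mask == 0)."""
    total = [0.0] * len(hidden[0])
    count = 0
    for vec, keep in zip(hidden, mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    # max(count, 1) avoids division by zero for an all-padding input
    return [x / max(count, 1) for x in total]


def first_last_avg(first_layer, last_layer, mask):
    """Average the first and last layers per token, then mean-pool the result."""
    mixed = [
        [(a + b) / 2 for a, b in zip(u, v)]
        for u, v in zip(first_layer, last_layer)
    ]
    return mean_pooling(mixed, mask)
```

With real models the same reductions are applied to the tensor of last-layer (or first- and last-layer) hidden states together with the attention mask; a model trained with one pooling strategy should generally be queried with the same one.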

How to define a custom model based on the transformers library?

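![image](https://github.com/jsksxs360/How-to-use-Transformers/assets/89055561/c24385e3-ff55-4b9d-99ee-5fb7a6395dc3) I tried the approach you wrote, and it can indeed load some pretrained models through the from_pretrained method, but only if the attribute here is named self.bert. If I rename it on this line to self.bert_model or any other name, the weights can no longer be loaded with from_pretrained at all. Isn't this too rigid? Is there another way to solve this?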
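One likely reason the attribute name matters is that checkpoint weights are stored under keys prefixed with the attribute name (for example `bert.embeddings...`), so a module named `bert_model` looks for keys that the checkpoint does not contain. A possible workaround, sketched below under that assumption, is to rename the key prefix before loading. `rename_prefix` is a hypothetical helper, shown here on a plain dict so it runs without any dependencies.

```python
def rename_prefix(state_dict, old_prefix, new_prefix):
    """Return a copy of state_dict with old_prefix replaced by new_prefix.

    Only keys that start with old_prefix are renamed; all others are kept.
    """
    renamed = {}
    for key, value in state_dict.items():
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        renamed[key] = value
    return renamed


# Toy stand-in for a checkpoint; real values would be tensors.
checkpoint = {
    "bert.embeddings.word_embeddings.weight": "W_emb",
    "bert.encoder.layer.0.output.dense.bias": "b_0",
    "classifier.weight": "W_cls",
}
remapped = rename_prefix(checkpoint, "bert.", "bert_model.")
```

The remapped dict could then be passed to PyTorch's `load_state_dict` on the custom module. transformers model classes also carry a `base_model_prefix` attribute that takes part in this key matching; whether overriding it alone is sufficient depends on how the checkpoint was saved, so treat that route as something to verify rather than a guaranteed fix.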