DeepPavlov

Facilities to fine-tune the BERT sentence embedder

Open vcjob opened this issue 4 years ago • 0 comments

Hello everyone!

I use sentence_multi_cased_L-12_H-768_A-12_pt to extract sentence embeddings. I have prepared texts for fine-tuning, but I'm not sure where to start or what to do. I have seen the Hugging Face scripts and training API, but I'm not sure whether the DeepPavlov model is compatible with them. I also saw a legacy script for fine-tuning a regular BERT model, but I guess it won't work with sentence_multi_cased_L-12_H-768_A-12_pt, or am I mistaken? What is the right way to fine-tune this model? Or would it be better to give up on this one and choose another model that can extract sentence embeddings and is easy to fine-tune?

Thanks a lot!

vcjob • May 24 '21 16:05