nlp_course

Help with seminar on conversation

Open maiiabocharova opened this issue 2 years ago • 2 comments

Hello, I am trying to adapt the idea of learning on triplets to a classification task on an imbalanced dataset.

I select one anchor, one positive example from the same class, and one negative example from a randomly chosen other class. I want the model to learn to embed sentences of the same class closer together, and afterwards to train an SVM or another classifier on the embeddings produced by the trained model.
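The triplet selection described above could be sketched roughly as follows, in plain Python. The names `build_index`, `sample_triplet`, and `triplet_loss` are hypothetical helpers, not part of the course code. Picking the anchor class uniformly over classes (rather than over examples) is an assumption, made here because the dataset is imbalanced, and the margin value is arbitrary.

```python
import random
from collections import defaultdict


def build_index(labels):
    """Map each class label to the list of example indices carrying it."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    return by_class


def sample_triplet(by_class, rng):
    """Return (anchor, positive, negative) example indices."""
    # Only classes with at least two examples can supply an anchor and a positive.
    anchor_classes = [c for c, idxs in by_class.items() if len(idxs) >= 2]
    # Assumption: choose the class uniformly so rare classes are not drowned out.
    anchor_class = rng.choice(anchor_classes)
    anchor, positive = rng.sample(by_class[anchor_class], 2)

    # Negative: a random example from a randomly chosen other class.
    negative_class = rng.choice([c for c in by_class if c != anchor_class])
    negative = rng.choice(by_class[negative_class])
    return anchor, positive, negative


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on Euclidean distances: max(0, d(a, p) - d(a, n) + margin)."""
    def dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)


labels = ["a", "a", "a", "a", "a", "a", "b", "b", "c"]  # toy imbalanced labels
by_class = build_index(labels)
rng = random.Random(0)
a, p, n = sample_triplet(by_class, rng)
```

The loss is zero once the negative is at least `margin` farther from the anchor than the positive is, so only triplets that still violate the margin contribute to training.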

Can you please suggest what the model's architecture could look like? In the course you suggested using several dense layers on top of the pretrained BERT (you also suggested not training the BERT embeddings, only these dense layers). What would be a good output size for the vector if I want to use it later for classification? Maybe 16?
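For concreteness, the setup mentioned above (trainable dense layers over frozen BERT outputs) might look something like this PyTorch sketch. It is only an illustration under several assumptions: the sentence vectors are taken to be precomputed 768-dimensional BERT-base outputs (random tensors stand in for them here), the class name `ProjectionHead`, the hidden size of 256, the L2 normalization, and the margin of 0.5 are all arbitrary choices, and the output size of 16 is simply the value proposed in the question, not a recommendation.

```python
import torch
from torch import nn
import torch.nn.functional as F


class ProjectionHead(nn.Module):
    """Hypothetical dense head trained on top of frozen sentence vectors."""

    def __init__(self, in_dim=768, hidden_dim=256, out_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):
        # Assumption: L2-normalize so distances are comparable across examples.
        return F.normalize(self.net(x), dim=-1)


torch.manual_seed(0)
head = ProjectionHead()
loss_fn = nn.TripletMarginLoss(margin=0.5)
# Only the head's parameters are optimized; BERT itself is never touched.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

# Random stand-ins for precomputed frozen BERT sentence vectors (batch of 8).
anchor, positive, negative = (torch.randn(8, 768) for _ in range(3))

# One training step on a batch of triplets.
loss = loss_fn(head(anchor), head(positive), head(negative))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because `out_dim` is a constructor argument, different embedding sizes can be compared by retraining the head and checking the downstream classifier on a validation set.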

I will be very grateful for suggestions!

P.S. Guys, you really are the best; your course helped me a lot in learning NLP!

maiiabocharova, Nov 27 '21 20:11