
Details of pretrained model

angoodkind opened this issue 4 years ago · 6 comments

Can you provide further details about the pretrained model? Is it using any context, etc. for the utterances? It would be really helpful if there was a paper I could point to.

angoodkind avatar Mar 15 '22 21:03 angoodkind

I tried bert-base-uncased, distilbert-base-uncased, and bert-large-uncased. The difference between these models was around 1–1.5 F1 points, with bert-large-uncased performing best. However, I feel that was a clear case of overfitting, so bert-base-uncased should be sufficient for this problem. I framed it as a multi-class classification problem, classifying sentences into around 38 intents.

If you are planning to work on it, you can look at existing solutions here (https://nlpprogress.com/english/dialogue.html)
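To illustrate the multi-class framing described above, here is a minimal PyTorch sketch. The encoder is a toy placeholder (in the real setup it would be a pretrained model such as bert-base-uncased), and the 38-label head follows the "around 38 intents" figure from the comment; all class and variable names are illustrative, not part of the library.

```python
import torch
import torch.nn as nn

NUM_INTENTS = 38  # roughly the number of intents mentioned above


class IntentClassifier(nn.Module):
    """Toy stand-in for a BERT-style encoder plus a classification head.

    In the actual setup, `encoder` would be a pretrained transformer and
    `pooled_input` would be its pooled [CLS] representation.
    """

    def __init__(self, hidden_size=768, num_labels=NUM_INTENTS):
        super().__init__()
        # Placeholder for the pretrained encoder.
        self.encoder = nn.Linear(hidden_size, hidden_size)
        # Linear head mapping the sentence representation to intent logits.
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, pooled_input):
        hidden = torch.tanh(self.encoder(pooled_input))
        return self.classifier(hidden)  # logits over the intent classes


model = IntentClassifier()
logits = model(torch.randn(4, 768))  # batch of 4 pooled sentence vectors
print(logits.shape)                  # torch.Size([4, 38])
```

The key point is that each utterance is encoded independently and scored against all intent classes at once, which matches a sentence-level multi-class setup rather than a context-aware dialogue model.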

bhavitvyamalik avatar Mar 16 '22 12:03 bhavitvyamalik

So are you just classifying utterances based on the semantics of each utterance itself, in isolation? Or is any prior context taken into account?


angoodkind avatar Mar 16 '22 13:03 angoodkind

Further, what kind of model did you use when training? I understand it was a multi-class classification problem, but what was the training process? Thanks!

angoodkind avatar Mar 17 '22 00:03 angoodkind

This is similar to a lot of the questions raised in #2

angoodkind avatar Mar 17 '22 00:03 angoodkind

Just following up on this. I would like to cite this library in a paper I am publishing. Can you please provide more details, at least with the type of model you used to train the classifier?

angoodkind avatar Apr 22 '22 16:04 angoodkind

Hi @angoodkind, apologies for the delayed response. As mentioned in my earlier comment:

> I tried bert-base-uncased, distilbert-base-uncased, and bert-large-uncased. The difference between these models was around 1–1.5 F1 points, with bert-large-uncased performing best. However, I feel that was a clear case of overfitting, so bert-base-uncased should be sufficient for this problem. I framed it as a multi-class classification problem, classifying sentences into around 38 intents.

Which model is used depends on how you call the API: `model = DialogTag('distilbert-base-uncased')` loads the fine-tuned weights for the model name you pass. Since it was a multi-class classification problem, I used CrossEntropyLoss as the loss function between the ground-truth intent and the predicted intent.
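For reference, the loss computation described above can be sketched as follows. This is not code from the library itself, just a minimal example of how `nn.CrossEntropyLoss` compares predicted intent logits against ground-truth intent indices; the batch size, the 38-class figure, and the target values are illustrative.

```python
import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()

# Fake logits for a batch of 2 sentences over 38 intent classes,
# and the ground-truth intent index for each sentence.
logits = torch.randn(2, 38)
targets = torch.tensor([5, 17])

# CrossEntropyLoss applies log-softmax to the logits internally,
# so the model outputs raw scores rather than probabilities.
loss = loss_fn(logits, targets)
print(loss.item())
```

Note that `CrossEntropyLoss` expects raw logits and integer class indices, which is why the classification head outputs unnormalized scores.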

bhavitvyamalik avatar Apr 23 '22 16:04 bhavitvyamalik