
Make training code available?

Open · JohnnyC08 opened this issue 4 years ago · 9 comments

I'm interested in reproducing the model in PyTorch and am curious how you preprocessed the data and trained it. I also didn't see any metrics reported and would like to see what those look like, so the training script would be nice to have as well!

Great repo by the way!

JohnnyC08 · Apr 27, 2021

+1 @JohnnyC08

It'd also be interesting to see if including context (conversation history / last utterances) improves the accuracy of predictions.

creatorrr · May 11, 2021

@creatorrr That's interesting.

How would you go about doing that? My first thought is to use a rolling window: concatenate the utterances in the window into a single block of text and assign the block the label of the window's last utterance.
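Something like this, maybe (a rough sketch in plain Python; the `utterances` and `labels` lists are made-up placeholders):

```python
# Hypothetical data: parallel lists of utterances and their dialog act labels.
utterances = ["Do you want to grab lunch?", "Not really.", "Oh okay.", "How about tomorrow?"]
labels = ["Yes-No-Question", "Dispreferred-answers", "Response-Acknowledgement", "Yes-No-Question"]

WINDOW = 3  # number of utterances per text block

examples = []
for i in range(len(utterances)):
    # Concatenate the current utterance with up to WINDOW - 1 preceding ones...
    block = " ".join(utterances[max(0, i - WINDOW + 1): i + 1])
    # ...and give the whole block the label of the last utterance in the window.
    examples.append((block, labels[i]))
```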

How would you do it?

JohnnyC08 · May 17, 2021

@JohnnyC08 I was thinking of something simpler: prepending the dialog act labels of the last three utterances to the input vector when fine-tuning. For example, take this conversation:

A: Do you want to grab lunch? [Yes-No-Question]
B: Not really. [Dispreferred-answers]
A: Oh okay. [Response-Acknowledgement]
B: How about tomorrow? <<TO PREDICT>>

Then the input vector would be: `[CLS] Yes-No-Question [SEP] Dispreferred-answers [SEP] Response-Acknowledgement [SEP] How about tomorrow? [SEP]`
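Roughly like this (a sketch using the Hugging Face tokenizer; in practice the tag strings should probably be registered as special tokens, which I'm glossing over here):

```python
from transformers import DistilBertTokenizerFast

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")

# Dialog act tags of the last three utterances, plus the utterance to classify.
context_tags = ["Yes-No-Question", "Dispreferred-answers", "Response-Acknowledgement"]
utterance = "How about tomorrow?"

# Join with [SEP]; the tokenizer then adds the leading [CLS] and trailing [SEP].
sep = tokenizer.sep_token
text = f" {sep} ".join(context_tags + [utterance])
inputs = tokenizer(text, return_tensors="pt")
```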

creatorrr · Jun 1, 2021

@creatorrr @JohnnyC08 Did either of you end up creating a model that uses previous context? Also, were you able to successfully predict on a GPU? Loading the model allocates all of my card's memory, which suggests a leak when the downloaded model is loaded.
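If the backend is TensorFlow, one thing I may try is enabling memory growth before loading, since TF reserves all GPU memory up front by default (a sketch, assuming a TF-based model):

```python
import tensorflow as tf

# By default TensorFlow grabs all GPU memory at startup. Memory growth
# makes it allocate only what it actually needs. This must run before
# the GPUs are initialized, i.e. before loading the model.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```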

argideritzalpea · Dec 16, 2021

@bhavitvyamalik Thanks again for publishing the model. I think that some comments on how training was conducted would really make this repo more complete.

- What are the inputs for training on the SwDA corpus? Are they single sentences or sequences of sentences?
- What training scripts were used to train this model?
- Are there any utilities for customizing this for another dataset?
- What parameters were used for fine-tuning?
- Which outputs of the DistilBERT encoding are used for the classification task?

I am attempting to use this for DA labeling on a conversational dataset, and it gives varying, poor results for the same simple sentence, "Okay." I assume this is because of dropout and over- or under-fitting. Overall, I'm not sure this model gives me the confidence required to use it for my project as is. If the training scripts and data were released, that would be awesome!
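For the varying predictions on "Okay.": if dropout were somehow left active at inference time, outputs would indeed differ across calls. With direct access to the underlying classifier, one could rule that out like this (a sketch assuming a Hugging Face DistilBERT checkpoint; the model name here is a placeholder, not the actual DialogTag weights):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder checkpoint; substitute the real fine-tuned weights.
name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

model.eval()  # disables dropout, so repeated calls give identical logits
with torch.no_grad():
    inputs = tokenizer("Okay.", return_tensors="pt")
    logits = model(**inputs).logits
print(logits.argmax(dim=-1))
```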

argideritzalpea · Dec 18, 2021

> @creatorrr @JohnnyC08 Did either of you end up creating a model that uses previous context? Also, were you able to successfully predict on a GPU? Loading the model allocates all of my card's memory, which suggests a leak when the downloaded model is loaded.

Haven't gotten around to it yet; I've been really busy, but I'll give it a try one of these weekends, @argideritzalpea.

creatorrr · Jan 19, 2022

> @creatorrr That's interesting.
>
> How would you go about doing that? My first thought is to use a rolling window: concatenate the utterances in the window into a single block of text and assign the block the label of the window's last utterance.
>
> How would you do it?

Ever get a chance to try this out, @JohnnyC08?

creatorrr · Jan 19, 2022

Hi, could you please share the training scripts? Could you also share a link to the training data?

hannan72 · Aug 21, 2022

@JohnnyC08 I ended up training a DeBERTa-based dialog act classifier on the silicone-merged dataset using sentence pairs (previous utterance, current utterance), and it performs better than single utterances. You can take a look here.
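For anyone curious, the sentence-pair setup is just the tokenizer's standard pair encoding (a sketch; the checkpoint name is a placeholder for whichever DeBERTa you fine-tune):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base")

prev_utt = "Not really."
curr_utt = "How about tomorrow?"

# Encoded as one sequence: [CLS] prev_utt [SEP] curr_utt [SEP]
inputs = tokenizer(prev_utt, curr_utt, return_tensors="pt")
```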

creatorrr · Sep 11, 2022