SWDA seems to give a lot of repetition in the sample responses for the test data.
Context 4-1: ('yeah ', 2)
Context 5-0: ("and the people in the city were saying well why should i go do that make the government do that that ' s not my job ", 27)
Context 6-1: ("right they ' ve got a lot of adjustments to make with coming out of what they ' ve been through ", 22)
Context 7-1: ('now and ', 3)
Context 8-1: ("they don ' t understand that to make that work they ' ve got to take some responsibility for themselves it ' s not just the government ' s responsibility anymore ", 32)
Target >> you can't just blame it on the government when they give you the freedom to take care of yourself then that puts some responsibility on you as well
Sample 0 >> it the their their their their their their the she she she she she she she she'she'i she'i she'she'i she'she'i'' she'
Sample 1 >> yeah
Sample 2 >> it is is is but she is
Sample 3 >> the the high school is high high school system
Sample 4 >> and but it is
Sample 5 >> these are just
Sample 6 >> in their of their high of their life is worth of an life life life life life life life life life life life life life life life life life life life life life life life life life life life life
Sample 7 >> it in their their their something is something is something is something something something her life life something something something something something something something something something something something something something something something something something something something something something something something
Sample 8 >> but but but i but i i'i i
Sample 9 >> but and but of their name of their life and their life just never just
Is there some way to avoid this?
Using beam search can avoid this. We will upload our latest version which uses beam search.
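To illustrate the difference, here is a minimal sketch of greedy decoding versus beam search. It is not the code from this repo: `TOY_MODEL` is a hypothetical first-order next-token table standing in for the real decoder, and `greedy_decode` and `beam_search` are illustrative names. Beam search does not guarantee repetition-free output; it only keeps alternatives alive that greedy decoding throws away at the first step.

```python
import math

# Hypothetical toy next-token probabilities keyed by the previous token.
# This stands in for the real decoder and is NOT taken from the repo.
TOY_MODEL = {
    "<s>": {"the": 0.6, "you": 0.4},
    "the": {"the": 0.4, "life": 0.3, "</s>": 0.3},
    "you": {"can": 0.9, "</s>": 0.1},
    "can": {"</s>": 1.0},
    "life": {"</s>": 1.0},
}


def greedy_decode(model, max_len=6):
    """Pick the single most likely next token at every step."""
    tokens, logp, last = [], 0.0, "<s>"
    for _ in range(max_len):
        nxt, p = max(model[last].items(), key=lambda kv: kv[1])
        logp += math.log(p)
        if nxt == "</s>":
            break
        tokens.append(nxt)
        last = nxt
    return tokens, logp


def beam_search(model, beam_width=2, max_len=6):
    """Keep the beam_width best partial hypotheses at every step."""
    beams = [([], 0.0, "<s>")]  # (tokens, log-probability, last token)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, logp, last in beams:
            for nxt, p in model[last].items():
                new_logp = logp + math.log(p)
                if nxt == "</s>":
                    finished.append((tokens, new_logp))
                else:
                    candidates.append((tokens + [nxt], new_logp, nxt))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if not beams:
            break
    # Hypotheses that hit max_len without ending still compete.
    finished.extend((tokens, logp) for tokens, logp, _ in beams)
    return max(finished, key=lambda h: h[1])


greedy_tokens, greedy_logp = greedy_decode(TOY_MODEL)
beam_tokens, beam_logp = beam_search(TOY_MODEL)
print("greedy:", " ".join(greedy_tokens))
print("beam:  ", " ".join(beam_tokens))
```

In this toy table greedy decoding commits to "the" and then loops on it, while a beam of width 2 keeps the initially less likely "you" alive and ends up with a higher-probability, non-repeating sequence.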
Okay, thanks a lot, but can you explain what is happening in the model that is causing this?
The SWDA dataset is low quality and contains many repetitions, which greedy decoding may be sensitive to. However, our final results should be better than what you showed. Did you finish all epochs? Early in training the repetition is very severe, but it improves as training goes on.
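One rough way to check whether repetition really drops over training is to compute a simple per-response repetition rate at each checkpoint. The sketch below is only an assumption about how one might measure it; `repeat_rate` is a hypothetical helper, not part of this repo, and it only counts immediately repeated tokens after whitespace tokenization.

```python
def repeat_rate(response):
    """Fraction of adjacent token pairs where a token repeats itself."""
    tokens = response.split()
    if len(tokens) < 2:
        return 0.0
    repeats = sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)
    return repeats / (len(tokens) - 1)


# Responses copied from the samples earlier in this thread.
samples = ["yeah", "it is is is but she is", "these are just"]
for s in samples:
    print(f"{repeat_rate(s):.2f}  {s}")
```

Averaging this value over the test-set samples after each epoch would show whether the degenerate loops fade as training proceeds.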
Yes, all 100 epochs were completed. Even with DialogDial, there is less repetition within each sample than with SWDA, but looking at the samples generated across the whole dataset, the responses are very similar to each other. I tried the same model on the Microsoft Frames dataset and observed the same thing: across multiple batches, the generated samples are almost the same.
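To put a number on "almost the same", one option is a distinct-n style ratio: the number of unique n-grams divided by the total number of n-grams over all generated responses. This is a minimal sketch, assuming whitespace tokenization; `distinct_n` is an illustrative helper and not something the repo provides.

```python
def distinct_n(responses, n=2):
    """Unique n-grams divided by total n-grams across all responses."""
    total = 0
    unique = set()
    for response in responses:
        tokens = response.split()
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    return len(unique) / total if total else 0.0


# Identical responses give a low score; varied responses score closer to 1.
print(distinct_n(["yeah", "yeah", "yeah"], n=1))
print(distinct_n(["and but it is", "these are just"], n=1))
```

A value close to 0 across a whole test set means the model keeps producing the same few n-grams regardless of context.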