multiwoz
Evaluation Dataset in "Response Generation" - "end-to-end models" part
Does this part use MultiWOZ 2.2 or MultiWOZ 2.0 as the benchmark dataset? I'm confused: the results in the original RewardNet, Mars, and KRLS papers are the same as those in your table, but those papers all report results on MultiWOZ 2.0. Moreover, the TOATOD paper reports its combined score on MultiWOZ 2.2. Is there a mistake? Could you explain this inconsistency? Thanks!
I have the same question.