
About DST on delexicalized response

Open Leezekun opened this issue 3 years ago • 3 comments

Hi, this is great work. Congratulations on the acceptance to ACL 2022.

But I have a question about the DST (dialogue state tracking) model. It seems that the DST model is trained and evaluated on delexicalized responses. However, some slot values are mentioned in the non-delexicalized system responses. How can the model predict these slots correctly if it is trained and evaluated on delexicalized responses?

Thanks!

Leezekun avatar Apr 21 '22 02:04 Leezekun

Hi,

Thank you for your interest in our work. Following previous studies, we only focus on the delexicalized part of DST prediction. I assume the model's accuracy on non-delexicalized slots cannot be guaranteed, given how we train and evaluate. One way to improve this might be to switch training and evaluation to the non-delexicalized format of the data.

Best,

Yixuan

yxuansu avatar Apr 21 '22 07:04 yxuansu
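
To make the concern in this exchange concrete, here is a minimal sketch of what delexicalization does to a system response. The helper name `delexicalize`, the `[value_<slot>]` placeholder format, and the example sentence are assumptions for illustration only; they are not taken from the pptod preprocessing code.

```python
import re
from typing import Dict


def delexicalize(response: str, slot_values: Dict[str, str]) -> str:
    """Replace each slot value in the response with a [value_<slot>] placeholder.

    Hypothetical helper: the placeholder format is an assumption, not pptod's own.
    """
    # Replace longer values first so a short value cannot clobber part of a longer one.
    for slot, value in sorted(slot_values.items(), key=lambda kv: -len(kv[1])):
        pattern = re.compile(re.escape(value), flags=re.IGNORECASE)
        response = pattern.sub(f"[value_{slot}]", response)
    return response


response = "I recommend Pizza Hut in the centre. Shall I book it?"
slots = {"name": "Pizza Hut", "area": "centre"}

delex = delexicalize(response, slots)
print(delex)

# The restaurant name can no longer be read off the delexicalized text.
assert "Pizza Hut" not in delex
```

If the user then replies "Yes, book it", the restaurant name appears nowhere in a dialogue history built from delexicalized responses, so a DST model that sees only that history has no surface form to copy the value from.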

Hi, thanks for the reply.

Did you mean that all the results comparing your model with other models (Tables 4 and 5) come from training and evaluating on delexicalized responses? I have tried training and evaluating the model on non-delexicalized responses, and the performance seems better.

Thanks

Leezekun avatar Apr 21 '22 17:04 Leezekun

Yes, that's right. Our model is evaluated on the delexicalized responses. It is quite interesting to know that the model can perform better on non-delexicalized responses :-)

yxuansu avatar Apr 21 '22 18:04 yxuansu