ConvLab-2
[Feature] Update MultiWOZ dataset from 2.1 to 2.2
Given the release of MultiWOZ 2.2, it seems like the baselines should all be retrained using the cleanest version of the dataset. Paper: https://www.aclweb.org/anthology/2020.nlp4convai-1.13/
Thanks! We've noticed MultiWOZ 2.2. We will add it if it is of high quality.
It would also be great to support the new format (which would also make it easy to add SGD).
We are planning to add many datasets (SchemaGuided, Taskmaster, etc.) using a unified format.
Great that you're planning to add SGD and Taskmaster. Any updates on when that will be available?
Actually, we have processed SGD, Taskmaster, and other datasets. We will update them with MultiWOZ 2.2 & 2.3 in a few days. Thanks!
Great stuff, looking forward to it!
@tomolopolis SGD and Taskmaster are now available in the unified format; see #180.
@zqwerty Thanks for that. Are there plans to port (some of) the existing supported model implementations to the unified format, and then, given the consistent format, make the dataset configurable in each model?
For example, some new modules might be:
convlab2/nlu/jointBERT/unified/nlu.py
convlab2/dst/comer/unified/dst.py
convlab2/policy/gdpl/unified/policy.py
convlab2/nlg/sclstm/unified/nlg.py
...
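A rough sketch of the idea, to make the question concrete. Everything here is hypothetical: the `DATASET_LOADERS` registry, the `UnifiedNLU` wrapper, the dataset keys, and the turn fields are made up for illustration and are not ConvLab-2 APIs or the actual unified format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical unified format: every dataset yields turns with the same keys.
Turn = Dict[str, str]


def _load_multiwoz22() -> List[Turn]:
    # Placeholder data; a real loader would read the processed dataset files.
    return [{"speaker": "user", "utterance": "I need a hotel in the north."}]


def _load_sgd() -> List[Turn]:
    # Placeholder data, same keys as above.
    return [{"speaker": "user", "utterance": "Find me a restaurant in San Jose."}]


# Hypothetical registry mapping a dataset name to its loader.
DATASET_LOADERS: Dict[str, Callable[[], List[Turn]]] = {
    "multiwoz22": _load_multiwoz22,
    "sgd": _load_sgd,
}


@dataclass
class UnifiedNLU:
    """Stand-in for a model wrapper that is configured by dataset name only."""

    dataset_name: str

    def load_turns(self) -> List[Turn]:
        # Because every loader returns the same structure, the model code
        # does not need any dataset-specific branches beyond this lookup.
        if self.dataset_name not in DATASET_LOADERS:
            raise ValueError(f"Unknown dataset: {self.dataset_name}")
        return DATASET_LOADERS[self.dataset_name]()
```

With something like this, switching a model from MultiWOZ 2.2 to SGD would just mean changing `UnifiedNLU("multiwoz22")` to `UnifiedNLU("sgd")`.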
@tomolopolis We will modify the unified data processing and support some of the useful models. However, some models involve a lot of dataset-specific processing that cannot be easily unified.
@tomolopolis We have added MultiWOZ 2.2 and multiwoz-coref. Check 34960ff in master. However, I deleted the previous commit in order to remove git lfs, because of its limited download bandwidth.
I've noticed that you have merged the previous pull request. I hope that will not bother you too much.
@zqwerty Thanks for adding those. No worries about deleting the previous commit; I can pull in the latest.