ConvLab-3

GlobalWoz DST Dataset

Open nikos-rvnt opened this issue 1 year ago • 0 comments

  • Name: GlobalWoz
  • Description: a large-scale multilingual ToD dataset globalized from an English ToD dataset for three unexplored use cases of multilingual ToD systems.
  • Paper: https://aclanthology.org/2022.acl-long.115.pdf
  • Data: https://github.com/bosheng2020/globalwoz
  • License: not stated in this issue (to be confirmed)
  • Motivation: a multilingual, multi-turn dataset covering 20 languages

Checkbox

  • [ ] Create data/unified_datasets/$dataset_name folder, where $dataset_name is the name of the dataset.
  • [ ] Create the dataset scripts under data/unified_datasets/$dataset_name following data/unified_datasets/README.md.
  • [ ] Run python check.py $dataset in the data/unified_datasets directory to validate the processed dataset and to get data statistics and shuffled dialog ids.
  • [ ] Add the dataset card data/unified_datasets/$dataset_name/README.md following data/unified_datasets/README_TEMPLATE.md.
  • [ ] Upload the data, scripts, and dataset card to https://huggingface.co/ConvLab
  • [ ] Update NOTICE with license information.
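
As a rough illustration of the first two checklist items, the folder could be scaffolded with a small script like the one below. This is only a sketch: the folder name `globalwoz`, the helper `scaffold_dataset`, and the script name `preprocess.py` are assumptions for illustration. The files actually required are listed in data/unified_datasets/README.md.

```python
from pathlib import Path


def scaffold_dataset(root: Path, dataset_name: str) -> Path:
    """Create data/unified_datasets/<dataset_name> with empty placeholder files."""
    target = root / "data" / "unified_datasets" / dataset_name
    target.mkdir(parents=True, exist_ok=True)
    # README.md is the dataset card named in the checklist;
    # preprocess.py is an assumed name for the dataset script.
    for filename in ("README.md", "preprocess.py"):
        path = target / filename
        if not path.exists():
            path.touch()
    return target


if __name__ == "__main__":
    # "globalwoz" is a hypothetical value for $dataset_name.
    folder = scaffold_dataset(Path("."), "globalwoz")
    print(f"Created {folder}")
    print("Next: cd data/unified_datasets && python check.py globalwoz")
```

The placeholders still need to be filled in by hand before check.py can validate anything.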

nikos-rvnt, Jun 28 '24 12:06