
All datasets used in ExplainaBoard 1.0 should be supported by DataLab SDK

Open · pfliu-nlp opened this issue 2 years ago · 1 comment

If necessary, we can introduce the concept of a version (e.g., explainaboard) and represent it as the sub_dataset_name. For example:

dataset = load_dataset("sst2", "explainaboard")
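To make the idea concrete, here is a minimal sketch of how a version could be looked up as a sub-dataset name. It is not DataLab's actual implementation: the `REGISTRY` dict, the `resolve_sub_dataset` function, the "default" key, and the description strings are all hypothetical names introduced only for illustration.

```python
# Hypothetical registry: (dataset_name, sub_dataset_name) -> variant description.
# The entries are illustrative placeholders, not real DataLab metadata.
REGISTRY = {
    ("sst2", "default"): "original release",
    ("sst2", "explainaboard"): "data as used in ExplainaBoard 1.0",
}


def resolve_sub_dataset(dataset_name, sub_dataset_name="default"):
    """Return the variant registered under (dataset_name, sub_dataset_name)."""
    key = (dataset_name, sub_dataset_name)
    if key not in REGISTRY:
        raise KeyError(f"no variant {sub_dataset_name!r} for dataset {dataset_name!r}")
    return REGISTRY[key]


variant = resolve_sub_dataset("sst2", "explainaboard")
```

With this layout, the ExplainaBoard version sits next to any other sub-dataset of the same dataset, so no separate version argument would be needed.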

pfliu-nlp · May 04 '22 04:05

aspect-based-sentiment-classification

  • [x] laptop14
  • [x] restaurant14
  • [x] restaurant16
  • [x] twitter

chunking

  • [x] conll00_chunk
  • [ ] conll03_chunk

word-segmentation

  • [ ] as
  • [ ] cityu
  • [ ] ckip
  • [ ] ctb
  • [x] msr
  • [ ] ncc
  • [ ] pku
  • [ ] sxu

named-entity-recognition

  • [x] conll2003
  • [ ] conll2000
  • [ ] ontonotes_ner + notebc
  • [ ] ontonotes_ner + notebn
  • [ ] ontonotes_ner + notemz
  • [ ] ontonotes_ner + notenw
  • [ ] ontonotes_ner + notetc
  • [ ] ontonotes_ner + notewb

text-classification

  • [x] atis
  • [x] cr
  • [x] dbpedia_14
  • [ ] imdb
  • [x] mr
  • [x] qc
  • [x] sst2
  • [x] sst5
  • [x] subj

text-pair-classification

  • [x] snli
  • [x] sick

text-summarization

  • [x] cnn_dailymail (we probably need a new version number)
  • [x] xsum (we probably need a new version number)

XTREME (?)

@neubig do we need to consider it now? I found that, with the mapping from DataLab's dataset -> sub_dataset, it would not be very difficult for us to build a composite leaderboard. For example, one simple method is to put the sub-datasets belonging to the same dataset together, with a tab or some other character as the separator.
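A rough sketch of that grouping, assuming results arrive as (dataset, sub_dataset, score) tuples. The function name `build_composite_rows`, the row layout, and the scores are made up for illustration and are not part of ExplainaBoard or DataLab:

```python
from collections import defaultdict


def build_composite_rows(results, separator="\t"):
    """Group per-sub-dataset scores under their parent dataset.

    results: iterable of (dataset, sub_dataset, score) tuples.
    Returns one text row per dataset: the dataset name followed by
    "sub_dataset=score" cells, joined by `separator`.
    """
    grouped = defaultdict(list)
    for dataset, sub_dataset, score in results:
        grouped[dataset].append((sub_dataset, score))

    rows = []
    for dataset in sorted(grouped):
        cells = [f"{sub}={score:.2f}" for sub, score in sorted(grouped[dataset])]
        rows.append(separator.join([dataset, *cells]))
    return rows


# Placeholder scores, not real results.
example_results = [
    ("ontonotes_ner", "notebn", 0.5),
    ("ontonotes_ner", "notebc", 0.25),
    ("conll2003", "default", 0.75),
]
composite_rows = build_composite_rows(example_results)
```

Each row then corresponds to one dataset on the composite leaderboard, with its sub-datasets laid out side by side.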

WMT (?)

pfliu-nlp · May 04 '22 05:05