ExplainaBoard
All datasets used in ExplainaBoard 1.0 should be supported by the DataLab SDK.
If necessary, we can introduce the concept of a version (e.g., `explainaboard`) and represent it as `sub_dataset_name`, for example:
dataset = load_dataset("sst2", "explainaboard")
aspect-based-sentiment-classification
- [x] laptop14
- [x] restaurant14
- [x] restaurant16
- [x] twitter
chunking
- [x] conll00_chunk
- [ ] conll03_chunk
word-segmentation
- [ ] as
- [ ] cityu
- [ ] ckip
- [ ] ctb
- [x] msr
- [ ] ncc
- [ ] pku
- [ ] sxu
named-entity-recognition
- [x] conll2003
- [ ] conll2000
- [ ] ontonotes_ner + notebc
- [ ] ontonotes_ner + notebn
- [ ] ontonotes_ner + notemz
- [ ] ontonotes_ner + notenw
- [ ] ontonotes_ner + notetc
- [ ] ontonotes_ner + notewb
text-classification
- [x] atis
- [x] cr
- [x] dbpedia_14
- [ ] imdb
- [x] mr
- [x] qc
- [x] sst2
- [x] sst5
- [x] subj
text-pair-classification
- [x] snli
- [x] sick
text-summarization
- [x] cnn_dailymail (we probably need a new version number)
- [x] xsum (we probably need a new version number)
XTREME (?)
@neubig do we need to consider it now? I found that, using DataLab's `dataset -> sub_dataset` mapping, it would not be very difficult for us to build a composite leaderboard. For example, one simple method is to put sub-datasets belonging to the same dataset together, with `tab` or another character as a separator.
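
A rough sketch of that grouping idea, in case it helps the discussion. The `PAIRS` list and the `group_sub_datasets` helper are hypothetical names made up for illustration and are not part of the DataLab SDK. The pairs themselves are taken from the checklist above.

```python
from collections import defaultdict

# Hypothetical (dataset, sub_dataset) pairs, copied from the checklist above.
PAIRS = [
    ("ontonotes_ner", "notebc"),
    ("ontonotes_ner", "notebn"),
    ("ontonotes_ner", "notemz"),
    ("sst2", "explainaboard"),
]


def group_sub_datasets(pairs, sep="\t"):
    """Map each dataset to its sub-datasets joined by `sep`, keeping input order."""
    grouped = defaultdict(list)
    for dataset, sub_dataset in pairs:
        # Skip duplicates so a sub-dataset appears only once per dataset.
        if sub_dataset not in grouped[dataset]:
            grouped[dataset].append(sub_dataset)
    return {dataset: sep.join(subs) for dataset, subs in grouped.items()}


composite = group_sub_datasets(PAIRS)
```

Each value in `composite` could then serve as the key of one composite leaderboard entry. Swapping `sep` for another character would only change how the sub-dataset names are joined.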