Pengfei Liu issues

Results: 42 issues of Pengfei Liu

All datasets used in ExplainaBoard 1.0 should be supported by DataLab SDK

If necessary, we can introduce the concept of a version (e.g., `explainaboard`) and represent it as `sub_dataset_name`, for example:

```
dataset = load_dataset("sst2", "explainaboard")
```

Summarization evaluation using `rouge2` is quite slow

For example, running

```
explainaboard --task summarization --custom_dataset_paths ./data/system_outputs/cnndm/cnndm_mini-dataset.tsv --system_outputs ./data/system_outputs/cnndm/cnndm_mini-bart-output.txt --metrics rouge2
```

took more than 15 seconds, which seems relatively slow?

efficiency
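To make a timing report like the one above reproducible, a small wrapper can measure the wall-clock time of any command. This is only a sketch: `time_command` is a hypothetical helper, not part of ExplainaBoard, and the stand-in command is there so the example runs without the dataset files.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return elapsed, completed.returncode


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; swap in the
    # explainaboard invocation from the issue to time the real evaluation.
    elapsed, code = time_command([sys.executable, "-c", "pass"])
    print(f"exit code {code}, {elapsed:.2f} s")
```

Running the same wrapper with different `--metrics` values would show whether `rouge2` itself accounts for the delay or whether start-up cost dominates.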

Inherited wealth from ExplainaBoard 1.0 - Single task

## The time is almost ripe to consider transferring resources from ExplainaBoard 1.0 to 2.0

### Purposes:

* ExplainaBoard 2.0 could be enhanced by introducing system outputs collected from ExplainaBoard...

enhancement

new-analysis

new-functionality

Interface for data format conversion

## Purpose:

* We can provide some out-of-the-box functions that users can use to convert the system output’s format from A to B, where
  * A: could the...

enhancement
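One possible shape for such conversion functions is a registry keyed by source and target format. The sketch below is an illustration only: the names `CONVERTERS`, `register`, and `convert` are hypothetical, and the `tsv` to `json` pair is an assumed example, since the formats the issue has in mind are truncated above.

```python
import csv
import io
import json

# Hypothetical registry: maps (source_format, target_format) to a converter.
CONVERTERS = {}


def register(source, target):
    """Decorator that records a converter under its (source, target) pair."""
    def decorator(func):
        CONVERTERS[(source, target)] = func
        return func
    return decorator


@register("tsv", "json")
def tsv_to_json(text):
    """Convert tab-separated rows with a header line into a JSON array of objects."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    return json.dumps(list(rows))


def convert(text, source, target):
    """Look up the registered converter and apply it, or fail with a clear error."""
    try:
        converter = CONVERTERS[(source, target)]
    except KeyError:
        raise ValueError(f"no converter from {source!r} to {target!r}") from None
    return converter(text)
```

With this layout, supporting a new format pair means adding one decorated function, and callers only ever use `convert`.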

Inherited wealth from ExplainaBoard 1.0 - Multilingual Benchmark

### Purpose:

* ExplainaBoard 2.0 should cover all features that ExplainaBoard 1.0-XTREME has.

### Methodology

* Re-format system outputs: how to format system outputs?
* Benchmark leaderboard: How to...

enhancement

new-task

new-analysis

ExplainaBoard maintains the hyper-parameter features?

multi-lingual evaluation?

Exportable Visualization of ExplainaBoard Analysis Report

Support more QA tasks?

Document the concept of `statistics` [Discussion]