Pengfei Liu

Results 42 issues of Pengfei Liu

If necessary, we can introduce the concept of a version (e.g., `explainaboard`) and represent it as `sub_dataset_name`, for example,

```
dataset = load_dataset("sst2", "explainaboard")
```

For example, running

```
explainaboard --task summarization --custom_dataset_paths ./data/system_outputs/cnndm/cnndm_mini-dataset.tsv --system_outputs ./data/system_outputs/cnndm/cnndm_mini-bart-output.txt --metrics rouge2
```

took more than 15 seconds, which I guess is relatively slow?

efficiency
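A timing report like the one above is easier to compare across machines when it is measured by a script. The following is a minimal sketch, not part of ExplainaBoard: `time_command` is a hypothetical helper, and it measures the wall-clock time of the whole process (interpreter start-up and imports included), not only the metric computation.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run a command and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return elapsed, completed.returncode


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere (assumption); replace it
    # with the explainaboard invocation from the report to time that instead.
    elapsed, code = time_command([sys.executable, "-c", "pass"])
    print(f"exit code {code}, {elapsed:.2f} s")
```

Running the same command several times and reporting the median would reduce noise from disk caching and first-time imports.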

## The time is almost ripe to consider transferring resources from ExplainaBoard 1.0 to 2.0

### Purposes:
* ExplainaBoard 2.0 could be enhanced by introducing system outputs collected from ExplainaBoard...

enhancement
new-analysis
new-functionality

## Purpose:
* we can provide some out-of-the-box functions so that users can use them to convert the system output’s format from A to B, where
  * A: could the...

enhancement
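The formats A and B are cut off in this listing, so the following is only an illustrative sketch of what an out-of-the-box converter of this kind might look like. It assumes a tab-separated input and a JSON Lines output; the function name `tsv_to_jsonl` and the column names are hypothetical and are not ExplainaBoard's API.

```python
import csv
import io
import json


def tsv_to_jsonl(tsv_text, fieldnames):
    """Convert tab-separated text into JSON Lines, one object per row."""
    reader = csv.reader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    lines = []
    for row_number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != len(fieldnames):
            raise ValueError(
                f"row {row_number}: expected {len(fieldnames)} columns, "
                f"got {len(row)}"
            )
        lines.append(json.dumps(dict(zip(fieldnames, row)), ensure_ascii=False))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical two-column system output: input text and gold label.
    sample = "good movie\tpositive\nbad plot\tnegative\n"
    print(tsv_to_jsonl(sample, ["text", "true_label"]))
```

Failing loudly on a column-count mismatch, instead of padding or truncating rows, keeps a malformed system output from silently producing a misleading analysis.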

### Purpose:
* ExplainaBoard 2.0 should cover all features that ExplainaBoard 1.0-XTREME has.

### Methodology
* Re-format system outputs: how to format system outputs?
* Benchmark leaderboard: How to...

enhancement
new-task
new-analysis

After discussion with Kiril and Haris, the idea of enabling ExplainaBoard to record hyper-parameter features for each task came up. To achieve this, the potential workflow is:
* design a...

new-functionality

new-functionality

It would be nice if users could export figures, based on the generated analysis reports, that can be used directly in their papers.
- [ ] supported by SDK
-...

new-functionality

- [ ] NQ (https://ai.google.com/research/NaturalQuestions/dataset)
- [ ] TriviaQA (https://nlp.cs.washington.edu/triviaqa/) datasets
- [ ] HotpotQA
- [ ] DROP

new-task

I guess `statistics` will be an important concept throughout the project; keeping it well organized and documented will benefit both developers and users. (Also, caching statistics from different scenarios...
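The note on caching is cut off in this listing, so the following is only a rough sketch of one way cached statistics could work: each result is stored as a JSON file under a key and recomputed only on a cache miss. The names `token_frequency` and `cached_statistics`, the key format, and the on-disk layout are all assumptions for illustration, not ExplainaBoard's design.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def token_frequency(samples):
    """Example statistic: how often each whitespace-separated token occurs."""
    counts = Counter()
    for text in samples:
        counts.update(text.split())
    return dict(counts)


def cached_statistics(cache_dir, key, compute):
    """Return the statistics stored under `key`, computing them only on a miss."""
    path = Path(cache_dir) / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    stats = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(stats), encoding="utf-8")
    return stats


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache_dir:
        # Hypothetical key naming the dataset and split the statistic covers.
        stats = cached_statistics(
            cache_dir, "sst2-train", lambda: token_frequency(["a b a", "b c"])
        )
        print(stats)
```

A key that encodes the dataset, split, and statistic name would let results from different scenarios share one cache directory without colliding.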