ExplainaBoard Document the concept of `statistics` [Discussion]

Open pfliu-nlp opened this issue 2 years ago • 0 comments

I think statistics are an important concept throughout the project, so keeping them well organized and documented would benefit both developers and users. (Caching statistics for different scenarios is also one of ExplainaBoard's valuable features.) In general, there are the following scenarios:

Statistics of training set

Purpose: it's costly to calculate training-set-dependent features on the fly, so we need a cacheable object that stores the important statistics of a dataset to speed up the calculation of those features.
Caching strategies:
- [x] store it in local filesystem (e.g., dataset["train"]._stat)
- [x] store it in DB (metadata.statistics)
- [ ] store it in S3 and put the S3 link in DB
SDK function
- [_gen_external_stats](https://github.com/neulab/ExplainaBoard/blob/c27c3391b7090f8be41ac076bc88143ea90623e7/explainaboard/processors/processor.py#L48)
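
To make the local-filesystem strategy concrete, here is a minimal sketch of "compute once, then read from cache". It does not show what `_gen_external_stats` actually does: the helper names (`compute_train_stats`, `load_or_compute_stats`), the JSON file layout, and the choice of token frequency as the cached statistic are all assumptions made for illustration.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def compute_train_stats(samples):
    """Hypothetical expensive step: count token frequencies over the training texts."""
    vocab = Counter()
    for text in samples:
        vocab.update(text.split())
    return {"vocab": dict(vocab), "num_samples": len(samples)}


def load_or_compute_stats(samples, cache_path):
    """Return cached statistics if the file exists, otherwise compute and store them."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        return json.loads(cache_path.read_text())
    stats = compute_train_stats(samples)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(stats))
    return stats


train = ["the cat sat", "the dog ran"]
cache_file = Path(tempfile.mkdtemp()) / "train_stats.json"  # assumed cache location
stats = load_or_compute_stats(train, cache_file)      # computed, then written to disk
stats_again = load_or_compute_stats([], cache_file)   # served from the cache file
```

The second call never touches the training samples, which is the point of caching: features such as "frequency of this token in the training set" can then be looked up from `stats["vocab"]` instead of being recomputed.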

Statistics for Scoring

Purpose: for some metrics (e.g., in text generation), we usually need to cache intermediate statistics for each sample (e.g., n-gram overlaps) so that downstream computations such as a non-composable overall evaluation score, confidence intervals, or significance tests can be done efficiently.
Caching strategies:
- [ ] store it in an in-memory dict? EaaS will generate it and pass it to ExplainaBoard processor?
SDK function
- [_gen_scoring_stats](https://github.com/neulab/ExplainaBoard/blob/c27c3391b7090f8be41ac076bc88143ea90623e7/explainaboard/processors/conditional_generation.py#L109)
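
As a rough illustration of why per-sample statistics are worth caching, the sketch below stores (matches, length) pairs for a simplified unigram precision, then reuses them for both the corpus-level score and a bootstrap confidence interval. This is not the code behind `_gen_scoring_stats`; the metric, the function names, and the percentile bootstrap are assumptions chosen to keep the example small.

```python
import random
from collections import Counter


def unigram_stats(hyp, ref):
    """Per-sample sufficient statistics: (clipped unigram matches, hypothesis length)."""
    hyp_counts, ref_counts = Counter(hyp.split()), Counter(ref.split())
    matches = sum(min(c, ref_counts[w]) for w, c in hyp_counts.items())
    return matches, sum(hyp_counts.values())


def corpus_precision(stats):
    """Corpus-level precision aggregated from cached per-sample statistics."""
    matches = sum(m for m, _ in stats)
    total = sum(t for _, t in stats)
    return matches / total if total else 0.0


def bootstrap_ci(stats, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval; only the cached statistics are resampled."""
    rng = random.Random(seed)
    scores = sorted(
        corpus_precision(rng.choices(stats, k=len(stats)))
        for _ in range(n_resamples)
    )
    lo = scores[int(alpha / 2 * n_resamples)]
    hi = scores[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


hyps = ["the cat sat", "a dog ran fast", "hello world"]
refs = ["the cat sat down", "the dog ran", "hello there world"]
cached = [unigram_stats(h, r) for h, r in zip(hyps, refs)]  # computed once per sample
overall = corpus_precision(cached)   # not the mean of the per-sample precisions
low, high = bootstrap_ci(cached)     # reuses `cached`, no text is re-tokenized
```

The overall score here is a ratio of sums rather than an average of per-sample scores, which is what makes it non-composable: each bootstrap resample has to re-aggregate the cached pairs, and that is cheap only because the n-gram counting was done once.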

Dataset-dependent statistics (from DataLab)

Purpose: from a DataLab dataset split, get the resources necessary to calculate statistics. (This is the original description from @neubig; I'd like to know more about your original intention.)
Caching:
- [ ] store it in local filesystem (e.g., dataset["split"]._stat or dataset["split"][feature_name])
SDK function
- _get_statistics_resources

Overall statistics

Purpose: a package of overall statistics about the system output, including its performance
SDK function
- get_overall_statistics

Fine-grained statistics

Purpose: a package of `_bucketing_samples` results
SDK function
- get_fine_grained_statistics
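
For readers unfamiliar with bucketing, the following sketch shows the general idea: group samples by a feature (here, input length) and report performance per bucket. The function names, the sample layout (`text`, `gold`, `pred`), and the bucket boundaries are hypothetical and are not taken from `get_fine_grained_statistics` or `_bucketing_samples`.

```python
from collections import defaultdict


def bucket_by_length(samples, boundaries):
    """Group sample indices into buckets by the token count of the input text."""
    buckets = defaultdict(list)
    for idx, sample in enumerate(samples):
        length = len(sample["text"].split())
        # Use the first boundary the length does not exceed, else the overflow bucket.
        label = next(
            (f"<={b}" for b in boundaries if length <= b), f">{boundaries[-1]}"
        )
        buckets[label].append(idx)
    return dict(buckets)


def fine_grained_accuracy(samples, buckets):
    """Accuracy and sample count for each bucket."""
    result = {}
    for label, ids in buckets.items():
        correct = sum(samples[i]["pred"] == samples[i]["gold"] for i in ids)
        result[label] = {"n_samples": len(ids), "accuracy": correct / len(ids)}
    return result


samples = [
    {"text": "good movie", "gold": "pos", "pred": "pos"},
    {"text": "not good at all", "gold": "neg", "pred": "pos"},
    {"text": "a truly wonderful and moving film", "gold": "pos", "pred": "pos"},
    {"text": "bad", "gold": "neg", "pred": "neg"},
]
buckets = bucket_by_length(samples, boundaries=[2, 4])
fine_grained = fine_grained_accuracy(samples, buckets)
```

In this toy data the only error falls in the middle bucket, which is the kind of pattern a single overall score would hide.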

@neubig @OscarWang114 (it seems the definitions of the last two statistics are a little different from those of the first two; maybe we could find a better way to name all of them)

Mar 22 '22 20:03 pfliu-nlp
