Pinzhen "Patrick" Chen issues

Results: 4 issues of Pinzhen "Patrick" Chen

Different scores from different COMET package versions 1.1.2 and 2.2.1

## 🐛 Bug When the same source, target, and reference files are evaluated using the same wmt22-comet-da checkpoint, `unbabel-comet 2.2.1` under `python3.9` and `unbabel-comet 1.1.2` under `python3.7` gave me dramatically different...

bug

Automatically derive filters based on a clean sample provided by the user.

In practice I would have big noisy training data and sample clean data that is representative of the downstream task (e.g. wmt validation sets). It is still difficult for me...

Add a separator between selected filters and the filter pool

ATM there is no boundary between the selected ones and the available ones. Also, some filters only need to be used once (like "remove whitespace"). Maybe when they are selected, they...

enhancement

Regarding the chrf metric's implementation

Hi, in this repo, the `chrf` metric [implementation](https://github.com/EleutherAI/lm-evaluation-harness/blob/ebe7226ebfb8d11a9fb8d6b53eb65891f895c633/lm_eval/api/metrics.py#L92C1-L103C52) calls `sacrebleu.corpus_chrf()` with default [parameters](https://github.com/mjpost/sacrebleu/blob/0f351010b8b641aaa59fe75b98d7cc522bf221eb/sacrebleu/compat.py#L94): character order 6 and word order 0. Perhaps in `metrics.py` it would be nice to include those...

feature request

good first issue
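
To illustrate why the character order mentioned in the chrf issue above is worth stating explicitly, here is a minimal pure-Python sketch of a character n-gram F-score. It is a simplified illustration, not sacrebleu's implementation, and it is not expected to reproduce sacrebleu's numbers. The names `char_ngrams` and `simple_chrf` are hypothetical, and the sketch assumes whitespace is removed, a single reference, no word n-grams (word order 0), and beta = 2.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of length n after removing whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                char_order: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level character n-gram F-score in [0, 100].

    Hypothetical helper for illustration only: it averages the per-order
    F-beta scores over every order from 1 to char_order for which both
    strings contain at least one n-gram.
    """
    f_scores = []
    for n in range(1, char_order + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the strings is shorter than n characters
        matches = sum((hyp & ref).values())  # clipped n-gram overlap
        precision = matches / sum(hyp.values())
        recall = matches / sum(ref.values())
        denom = beta ** 2 * precision + recall
        f_scores.append(
            (1 + beta ** 2) * precision * recall / denom if denom > 0 else 0.0
        )
    return 100 * sum(f_scores) / len(f_scores) if f_scores else 0.0


# The same sentence pair scores differently depending on the character order.
low_order = simple_chrf("abcd", "abxd", char_order=1)
high_order = simple_chrf("abcd", "abxd", char_order=2)
print(low_order, high_order)
```

Because the score depends on the chosen order, two reports labelled "chrF" are only comparable when they state the same settings.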