Clémentine Fourrier issues

Results: 43 issues of Clémentine Fourrier

Collate items in GenerativeTaskDataset by similar EOS token

bug

To remember for version upgrades

Theoretically, everything before 1.0.0 is considered unstable and prone to change at any time: "Major version zero (0.y.z) is for initial development. Anything MAY change at any time...."

to keep
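The rule quoted above can be checked mechanically. The following is a minimal sketch, assuming plain `MAJOR.MINOR.PATCH` version strings; the helper name `is_initial_development` is hypothetical and not part of any library.

```python
def is_initial_development(version: str) -> bool:
    """Return True for 0.y.z versions, which semantic versioning treats as unstable.

    Assumes a plain "MAJOR.MINOR.PATCH" string; pre-release or build
    suffixes after the major component do not affect the result.
    """
    major = int(version.split(".")[0])
    return major == 0


if __name__ == "__main__":
    for candidate in ("0.3.1", "1.0.0"):
        status = "unstable" if is_initial_development(candidate) else "stable"
        print(candidate, status)
```

A check like this could gate a warning on upgrade, but pinning exact versions in the dependency file is usually the simpler safeguard.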

`--no_multichoice_continuations_start_space` should also cover the start-of-word token

If the tokenizer prepends `_` as a start-of-word (SOW) token, single-token evals will fail. Reported by @anton-l

bug
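The failure mode in the issue above can be illustrated with a toy example. This is only a sketch: it assumes a tokenizer that emits a SentencePiece-style marker (`▁`, U+2581) as its own token before each word, and the helpers `toy_tokenize`, `drop_leading_sow`, and `is_single_token` are hypothetical, not functions from the library.

```python
SOW = "\u2581"  # assumed SentencePiece-style start-of-word marker


def toy_tokenize(text: str) -> list[str]:
    """Toy tokenizer (assumption): emits the marker as a separate token before each word."""
    tokens: list[str] = []
    for word in text.split():
        tokens.extend([SOW, word])
    return tokens


def drop_leading_sow(tokens: list[str]) -> list[str]:
    """Remove a leading start-of-word marker token, if there is one."""
    if tokens and tokens[0] == SOW:
        return tokens[1:]
    return tokens


def is_single_token(tokens: list[str]) -> bool:
    """A single-token eval expects each choice to be exactly one token long."""
    return len(tokens) == 1


if __name__ == "__main__":
    choice = toy_tokenize("A")
    print(choice, is_single_token(choice))
    cleaned = drop_leading_sow(choice)
    print(cleaned, is_single_token(cleaned))
```

With the marker left in place, the choice "A" is two tokens long and the single-token check rejects it; dropping the leading marker restores a one-token choice.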

More variety in prompt support

At the moment, we support chat templates (need to be edited for multichoice), but not CoT. Could be cool to add.

feature request

Need to reupload TruthfulQA

At the moment, following the harness, TruthfulQA hardcodes the few-shot samples. We should instead reupload the dataset with the few-shot samples on the side, and use our normal mechanism for...

bug

Finish the clean up

- Add more docs
- Move `os.environ["TOKENIZERS_PARALLELISM"] = "false"` to the main scripts.

documentation
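As a rough sketch of the second item above, an entry-point script could set the variable at the very top, before any tokenizer is loaded. The `main` function and its body here are placeholders, not the project's actual script.

```python
import os

# Set once in the entry-point script, before any tokenizer is imported or used,
# so library modules no longer need to touch the environment themselves.
os.environ["TOKENIZERS_PARALLELISM"] = "false"


def main() -> str:
    """Placeholder entry point (hypothetical): the evaluation would be launched here."""
    return os.environ["TOKENIZERS_PARALLELISM"]


if __name__ == "__main__":
    print("TOKENIZERS_PARALLELISM =", main())
```

Keeping the assignment in the main scripts means importing the library elsewhere does not silently change the caller's environment.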

Allow passing a post normalisation function

See for example https://github.com/wiskojo/lm-evaluation-harness/blob/60c3d381b893b164be0d919d3e9992a6c0fe6ce3/lm_eval/tasks/ifeval/instructions.py

feature request

Lib license

Hi, what's the license of your library?

GenAI Arena - publication pushed back

Ready for a light review

Programmatic interface + cleaner management of requests

This PR does 2 things: - introduce a programmatic interface with a Pipeline object which should allow users to call the models more easily (also removes the evaluator since most...