Clémentine Fourrier

Results 43 issues of Clémentine Fourrier

Theoretically, everything before 1.0.0 is considered unstable and prone to change at any time:

> Major version zero (0.y.z) is for initial development. Anything MAY change at any time....

to keep

If the tokenizer prepends `_` as a start-of-word token, single-token evals will fail. Reported by @anton-l
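
To illustrate the kind of failure described above, here is a minimal sketch. It assumes the marker is emitted as a separate token, so a choice that should be one token becomes two. `toy_tokenize` and `is_single_token` are hypothetical helpers written for this example, not functions from the library or from any real tokenizer.

```python
from typing import List


def toy_tokenize(text: str, prepend_marker: bool) -> List[str]:
    """Hypothetical tokenizer: one token per whitespace-separated word.

    When prepend_marker is True, a standalone `_` token is emitted before
    each word (an assumption about how the reported tokenizer behaves).
    """
    tokens: List[str] = []
    for word in text.split():
        if prepend_marker:
            tokens.append("_")
        tokens.append(word)
    return tokens


def is_single_token(choice: str, prepend_marker: bool) -> bool:
    """Check the precondition a single-token eval would rely on."""
    return len(toy_tokenize(choice, prepend_marker)) == 1


print(is_single_token("A", prepend_marker=False))  # True
print(is_single_token("A", prepend_marker=True))   # False
```

Under this assumption, a fix would need to either strip the marker before the length check or compare only the last token of each choice.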

bug

At the moment, we support chat templates (which need to be edited for multichoice), but not chain-of-thought (CoT). It would be good to add.

feature request

At the moment, following the harness, TruthfulQA hardcodes the few-shot samples. We should instead re-upload the dataset with the few-shot samples on the side, and use our normal mechanism for...

bug

- Add more docs
- Move `os.environ["TOKENIZERS_PARALLELISM"] = "false"` to the main scripts.
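
The second item could look roughly like the sketch below. It assumes the variable is set once at the top of an entry-point script, before any tokenizer work starts; the `main` function is a hypothetical placeholder, not the project's actual script.

```python
import os

# Set once in the entry-point script, before tokenizers are used, so that
# library modules do not each have to set it themselves.
os.environ["TOKENIZERS_PARALLELISM"] = "false"


def main() -> None:
    # Hypothetical entry point: the real script would parse arguments and
    # launch the evaluation here.
    print("TOKENIZERS_PARALLELISM =", os.environ["TOKENIZERS_PARALLELISM"])


main()
```

Keeping the assignment in the main scripts means that importing the library elsewhere does not silently change the caller's environment.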

documentation

See for example https://github.com/wiskojo/lm-evaluation-harness/blob/60c3d381b893b164be0d919d3e9992a6c0fe6ce3/lm_eval/tasks/ifeval/instructions.py

feature request

Hi, what's the license of your library?

Ready for a light review

This PR does 2 things:
- introduce a programmatic interface with a `Pipeline` object, which should allow users to call the models more easily (also removes the evaluator since most...