Clémentine Fourrier

Results 43 issues of Clémentine Fourrier

## Issue encountered
At the moment, we assume one single script per language (which is incorrect).
## Solution/Feature
The Language enum needs to be updated to include the script, and...

good first issue
feature/enhancement
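
One way to picture the script-aware language identifier requested in the issue above is sketched below. This is only an illustration, not lighteval's actual `Language` enum: the `Script` enum, the `LanguageVariant` class, and the `tag` format are all hypothetical names chosen for the example, and the codes are a small subset of ISO 639-3 and ISO 15924.

```python
from dataclasses import dataclass
from enum import Enum


class Script(Enum):
    """Hypothetical enum holding a few ISO 15924 script codes."""
    LATIN = "Latn"
    CYRILLIC = "Cyrl"
    HANS = "Hans"  # Han, simplified
    HANT = "Hant"  # Han, traditional


@dataclass(frozen=True)
class LanguageVariant:
    """Hypothetical pairing of a language with one of its scripts."""
    language: str  # ISO 639-3 code, e.g. "srp" for Serbian
    script: Script

    @property
    def tag(self) -> str:
        # Join language and script codes into a single identifier.
        return f"{self.language}_{self.script.value}"


# Serbian is written in both Cyrillic and Latin, so a single
# "one script per language" entry cannot represent it.
SERBIAN_CYRILLIC = LanguageVariant("srp", Script.CYRILLIC)
SERBIAN_LATIN = LanguageVariant("srp", Script.LATIN)

print(SERBIAN_CYRILLIC.tag)
print(SERBIAN_LATIN.tag)
```

Keeping the script as a separate field means existing language codes stay unchanged while the same language can appear with several scripts.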

Hi dear community! We recently discovered that 20% of our lighteval users are from China 😮 ✨ ... and we cover a really large number of tasks in Chinese, thanks...

good first issue
help wanted
new-task

## Evaluation short description
Bias/hallucination eval
## Evaluation metadata
Provide all available
- Paper url: https://www.giskard.ai/knowledge/good-answers-are-not-necessarily-factual-answers-an-analysis-of-hallucination-in-leading-llms
- Dataset url: https://huggingface.co/datasets/giskardai/phare

good first issue
help wanted
new-task

Adds the option to use the new AsyncVLLM from vllm v1. It supports DP + PP/TP, but not setting the batch size, and deploys an independent async VLLM model which...

feature/enhancement

## Issue encountered, solution
We used to be able to run multinode evaluations with lighteval; this needs to be tested and added back. Be careful with https://github.com/huggingface/lighteval/pull/481

feature/enhancement

Hi! Super cool work! I'm a researcher at HuggingFace working on evaluation and leaderboards. I understand that this cool eval suite is first and foremost there to evaluate use cases...

Potential idea: remove suites entirely and just add suite to the task name if needed

## Describe the bug
EvaluationTracker.save() will fail at `dataset = Dataset.from_list([asdict(detail) for detail in task_details])` with
```
Exception has occurred: ArrowInvalid (note: full exception trace is shown but execution is...
```

bug

At the moment, launching them requires custom tasks; this must be documented.

## Issue encountered : https://jiaweizzhao.github.io/deepconf/static/htmls/code_example.html cc @lewtun

science-team