lighteval

[FT] Single token completion loglikelihood auto-detection

Open hynky1999 opened this issue 1 year ago • 0 comments

Issue encountered

  • If every choice in a loglikelihood task is exactly one token, a single forward pass per context is enough to compute the logprobs of all choices. This is the case for the MCF formulation (A/B/C) of tasks, which is the most used. Currently, however, anyone who wants this fast evaluation has to use a special metric variant (metric_single_token). This is annoying to maintain, and many users don't know about it, so they miss the potential speed-up.
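To illustrate why one pass suffices, here is a minimal sketch in plain Python. It assumes the model has already produced next-token logits for the position right after the context; the function name `single_token_logprobs` and the toy vocabulary are hypothetical and not part of lighteval.

```python
import math
from typing import List, Sequence


def single_token_logprobs(
    next_token_logits: Sequence[float], choice_token_ids: Sequence[int]
) -> List[float]:
    """Return the logprob of each single-token choice from one set of logits."""
    # Numerically stable log-softmax normaliser over the whole vocabulary.
    max_logit = max(next_token_logits)
    log_norm = max_logit + math.log(
        sum(math.exp(x - max_logit) for x in next_token_logits)
    )
    # Every choice is read off the same distribution, so no extra pass is needed.
    return [next_token_logits[t] - log_norm for t in choice_token_ids]


# Toy vocabulary of 4 tokens; ids 1, 2, 3 stand in for "A", "B", "C" (assumed ids).
logits = [0.0, 2.0, 1.0, 0.5]
choice_ids = [1, 2, 3]
logprobs = single_token_logprobs(logits, choice_ids)
best_choice = max(range(len(logprobs)), key=logprobs.__getitem__)
```

With multi-token choices, each (context, choice) pair would need its own scored sequence, which is where the extra cost comes from.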

Solution/Feature

We could detect the single-token case automatically while computing loglikelihood requests.

  1. Group loglikelihood requests by context.
  2. From each group, select the requests whose continuation is exactly one token.
  3. Run the single-token requests from each group using the single-token workflow.
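The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `LoglikelihoodRequest`, its fields, and `split_single_token` are hypothetical names and do not reflect lighteval's actual request classes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LoglikelihoodRequest:
    # Hypothetical request shape: already-tokenized context and continuation.
    context_tokens: Tuple[int, ...]
    continuation_tokens: Tuple[int, ...]


def split_single_token(
    requests: List[LoglikelihoodRequest],
) -> Tuple[Dict[Tuple[int, ...], List[LoglikelihoodRequest]], List[LoglikelihoodRequest]]:
    """Split requests into per-context single-token groups and the remainder."""
    # Step 1: group requests by their context.
    groups: Dict[Tuple[int, ...], List[LoglikelihoodRequest]] = defaultdict(list)
    for req in requests:
        groups[req.context_tokens].append(req)

    single_token: Dict[Tuple[int, ...], List[LoglikelihoodRequest]] = {}
    multi_token: List[LoglikelihoodRequest] = []
    for context, group in groups.items():
        # Step 2: keep the requests whose continuation is exactly one token.
        singles = [r for r in group if len(r.continuation_tokens) == 1]
        if singles:
            # Step 3 would score each of these groups with one forward pass.
            single_token[context] = singles
        # Everything else falls back to the regular loglikelihood path.
        multi_token.extend(r for r in group if len(r.continuation_tokens) != 1)
    return single_token, multi_token


requests = [
    LoglikelihoodRequest((5, 6, 7), (10,)),
    LoglikelihoodRequest((5, 6, 7), (11,)),
    LoglikelihoodRequest((5, 6, 7), (12, 13)),
    LoglikelihoodRequest((8, 9), (10,)),
]
single_groups, remainder = split_single_token(requests)
```

Here the context `(5, 6, 7)` yields one single-token group of two requests, while the two-token continuation stays on the regular path. Whether a mixed group should be split like this or routed entirely through the regular path is a design choice the issue leaves open.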

Benefits

Huge speed-up for all MCQ tasks. Models and metrics become easier to maintain (no need to create and handle single-token variants of metrics).

hynky1999, Oct 10 '24 11:10