Baber Abbasi
Updated the warning for `gen_kwargs`. The previous one said "these settings will be used over set parameters in yaml tasks", but we only _updated_ the dict, right? Not sure which...
Added some drafts in the `Readme`! Also moved the table to the bottom; I thought this big (and ever-growing!) thing broke the flow of the document. Let me know what...
Sounds good! Shouldn't this be merged after #1167? Looks like most of the workarounds here won't be needed after that. Also, thinking about it, I'm not really sold on the `predict_only`...
> I think it's still valuable to have! For example, in the Llemma sympy-checked math tasks, for Maj@K at high K, doing the scoring actually takes way more wall-clock time...
The Winogrande results on OpenLLM are 5-shot. Were your evals also 5-shot @JeevanBhoot? That could suggest something changed after `b281b09` in the few-shot split implementation if you're getting the same...
@haileyschoelkopf check this when you get a chance. I think we should consider something like Google Sheets or just CSV. Markdown on its own might be too cluttered, esp with...
@haileyschoelkopf Might have missed this!
This should have been fixed in #1229. Are you on the latest commit?
Hmm. Can you provide the full command? The previous bug occurred only when using batch "auto".
The second one looks like a tokenizer bug. @haileyschoelkopf