luqman-openai
Thanks for opening this PR. Character-level reasoning and counting is a well-known failure mode of the model due to a common underlying issue in LLMs. In its current form,...
Thanks for opening this PR. Arithmetic and other complex calculations are hard for the model to do zero-shot, without a chance to reason through the steps or to use tools...
Thanks for the contribution. The implementation of `evals.elsuite.2truths1lie:Truths2Lie1` seems to be missing from the PR. We're not accepting Evals that have custom code implementations at this moment (but we...
Closing the PR due to inactivity and incomplete files such as `idioms/few_shot.jsonl`. Please reopen if you get a chance to complete this PR.
Closing for now, please feel free to reopen if you get a chance to address the comments.
Thank you for opening this PR. We're not accepting Evals that have custom code implementations at this moment (but we are accepting custom model-graded evals). If possible, could you rewrite...
Closing for now, please feel free to reopen if you get a chance to address the comments.
Thanks for opening this PR. Character-level reasoning and counting is a well-known failure mode of the model due to a common underlying issue in LLMs. In its current form,...
Closing for now, please feel free to reopen if you get a chance to address the comments.
Closing for now, please feel free to reopen if you get a chance to address the comments.