
Allow multiple evaluation criteria to be used by the LLM judge.

Open esteinholtz-cloudera opened this issue 6 months ago • 5 comments

The Eval section of the notebook has been updated to enable multiple evaluation criteria.

Sample output:

Effectiveness in resolving the conflict
Rank 1: gemini-2.0-flash
Rank 2: gpt-4o-mini
Rank 3: llama3

Clarity of argument
Rank 1: gemini-2.0-flash
Rank 2: gpt-4o-mini
Rank 3: llama3

Creativity of solution
Rank 1: llama3
Rank 2: gemini-2.0-flash
Rank 3: gpt-4o-mini
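For readers who want to see the shape of a multi-criteria judge, here is a minimal sketch and not the notebook's actual code. It assumes the judge is asked to reply with JSON of the form `{"results": {criterion: [competitor numbers, best first]}}`; that reply format and the names `CRITERIA`, `build_judge_prompt`, `parse_rankings`, and `format_rankings` are all hypothetical.

```python
import json

# Fixed criteria taken from the sample output in this thread.
CRITERIA = [
    "Effectiveness in resolving the conflict",
    "Clarity of argument",
    "Creativity of solution",
]


def build_judge_prompt(question, answers, criteria):
    """Build one prompt asking the judge to rank the answers once per criterion."""
    responses = "\n\n".join(
        f"# Response from competitor {i}\n{answer}"
        for i, answer in enumerate(answers, start=1)
    )
    criteria_list = "\n".join(f"- {c}" for c in criteria)
    return (
        f"You are judging {len(answers)} competitors on this question:\n"
        f"{question}\n\n"
        f"Rank the competitors separately for each criterion:\n{criteria_list}\n\n"
        # Assumed reply format; the real notebook may use a different one.
        'Reply with JSON only: {"results": {"<criterion>": '
        '["best competitor number", "second best", ...]}}\n\n'
        f"{responses}"
    )


def parse_rankings(judge_reply, competitors, criteria):
    """Map the judge's competitor numbers back to model names, per criterion."""
    data = json.loads(judge_reply)
    rankings = {}
    for criterion in criteria:
        order = data["results"][criterion]
        rankings[criterion] = [competitors[int(n) - 1] for n in order]
    return rankings


def format_rankings(rankings):
    """Render rankings in the same layout as the sample output."""
    lines = []
    for criterion, ranked in rankings.items():
        lines.append(criterion)
        for rank, name in enumerate(ranked, start=1):
            lines.append(f"Rank {rank}: {name}")
        lines.append("")
    return "\n".join(lines).rstrip()
```

The prompt from `build_judge_prompt` would be sent to whichever judge model is in use, and the reply passed to `parse_rankings` together with the list of competitor names in the order their answers were supplied.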

TODO: Evaluation criteria could be LLM-generated rather than a fixed set

esteinholtz-cloudera, Jun 30 '25 16:06

Hey, thanks for this great contribution @esteinholtz-cloudera. It looks like you might have saved the notebook with its outputs included; it's about 1,700 lines of code. Would you be able to resubmit with the outputs cleared? Thanks so much, Ed
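For anyone hitting the same problem, outputs can be cleared from the notebook editor before saving, or with a small script. The following is a minimal sketch that relies only on the standard `.ipynb` JSON layout (code cells carry `outputs` and `execution_count`); the function names are illustrative.

```python
import json


def clear_outputs(notebook):
    """Remove outputs and execution counts from every code cell, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_outputs_in_file(path):
    """Load a .ipynb file, strip its outputs, and write it back to the same path."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    clear_outputs(notebook)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1, ensure_ascii=False)
        f.write("\n")
```

Calling `clear_outputs_in_file` on the notebook before committing keeps the diff limited to the cells that actually changed.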

ed-donner, Jul 06 '25 14:07

Oops, missed that. It should be cleaned up now.

esteinholtz-cloudera, Jul 10 '25 12:07

Hmm @esteinholtz-cloudera, it seems that there are some updates to folders outside community_contributions. Was that intentional? Would you be able to move all updates within community_contributions? Thanks so much, Ed

ed-donner, Jul 12 '25 18:07

Any files committed outside community_contributions were unintentional.

I have tried a couple of remedies, including creating a new branch, but for some reason those other changes persist in that branch as well (?). I swear I only added the community contributions this time ... 😢

I added git lfs along the way, which might be why things got messed up. This is above my skill level in git...

Can you include only the community_contributions file in the merge? It is all there.
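One way to see which changes fall outside the folder is to list the files that differ from the base branch and filter them. This is a rough sketch: the base branch name `main` and the helper names are assumptions, and the script only reports stray files; it does not fix the branch.

```python
import subprocess
from pathlib import PurePosixPath


def changed_files(base="main"):
    """List files that differ between the current branch and its merge base with `base`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def outside_folder(paths, folder="community_contributions"):
    """Return the paths that do not sit under any directory named `folder`."""
    return [p for p in paths if folder not in PurePosixPath(p).parts[:-1]]


if __name__ == "__main__":
    for path in outside_folder(changed_files()):
        print(path)
```

Anything it prints is a file the pull request touches outside community_contributions.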

esteinholtz-cloudera, Jul 14 '25 09:07

Ahh, unfortunately @esteinholtz-cloudera, I don't believe GitHub allows me to make a partial merge. https://chatgpt.com/share/687afbae-cc80-8012-a2b3-02658dc8b026

ed-donner, Jul 19 '25 01:07