Update RAG LLM-as-judge metrics to support the llama-3-3-70b model on WML.
Adding support for llama-3-3-70b model from WML.
Hi @piotrhelm - You need to delete the old llama_3_1 judge JSON files from the repo.
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_answer_correctness_q_a_gt_loose.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_answer_relevance_q_a.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_correctness_holistic_q_c_a_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_answer_correctness_q_a_gt_loose_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_answer_relevance_q_a_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_faithfulness_c_a_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_faithfulness_q_c_a_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_faithfulness_q_c_a.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_correctness_holistic_q_c_a.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_faithfulness_c_a.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_context_relevance_q_c_ares.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/llm_as_judge/binary/llama_3_1_70b_instruct_wml_context_relevance_q_c_ares_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/correctness_holistic/llama_3_1_70b_instruct_wml_q_c_a_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/correctness_holistic/llama_3_1_70b_instruct_wml_q_c_a.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/correctness_holistic/llama_3_1_70b_instruct_wml_q_c_a_numeric.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/faithfulness/llama_3_1_70b_instruct_wml_c_a.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/faithfulness/llama_3_1_70b_instruct_wml_q_c_a_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/faithfulness/llama_3_1_70b_instruct_wml_q_c_a.json
    main()
  File "/home/runner/work/unitxt/unitxt/utils/prepare_all_artifacts.py", line 198, in main
    raise RuntimeError(
RuntimeError: Branch's catalog is different from the total production of branch's prepare files. See details in the logs.
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/faithfulness/llama_3_1_70b_instruct_wml_c_a_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/faithfulness/llama_3_1_70b_instruct_wml_c_a_verbal.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/faithfulness/llama_3_1_70b_instruct_wml_q_c_a_verbal.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/answer_correctness/llama_3_1_70b_instruct_wml_q_a_gt_loose_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/answer_correctness/llama_3_1_70b_instruct_wml_q_a_gt_loose.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/answer_correctness/llama_3_1_70b_instruct_wml_q_a_gt_loose_numeric.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/context_relevance/llama_3_1_70b_instruct_wml_q_c_ares_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/context_relevance/llama_3_1_70b_instruct_wml_q_c_ares_numeric.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/context_relevance/llama_3_1_70b_instruct_wml_q_c_ares.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/answer_relevance/llama_3_1_70b_instruct_wml_q_a.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/answer_relevance/llama_3_1_70b_instruct_wml_q_a_logprobs.json
/home/runner/work/unitxt/unitxt/src/unitxt/catalog/metrics/rag/answer_relevance/llama_3_1_70b_instruct_wml_q_a_numeric.json
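For reference, a minimal sketch of one way to locate (and optionally remove) the stale files listed above. The helper names `find_stale_judge_files` and `delete_stale_judge_files` are hypothetical and not part of unitxt; the script assumes it is run from the repository root and that every stale file name starts with `llama_3_1_70b_instruct_wml`, as in the log.

```python
from pathlib import Path


def find_stale_judge_files(catalog_root, prefix="llama_3_1_70b_instruct_wml"):
    """Return the sorted JSON files under catalog_root whose name starts with prefix."""
    root = Path(catalog_root)
    # rglob searches all subdirectories (binary/, rag/faithfulness/, ...).
    return sorted(p for p in root.rglob(f"{prefix}*.json") if p.is_file())


def delete_stale_judge_files(catalog_root, dry_run=True):
    """Return the stale files; delete them only when dry_run is False."""
    stale = find_stale_judge_files(catalog_root)
    for path in stale:
        if not dry_run:
            path.unlink()
    return stale


if __name__ == "__main__":
    # Assumption: current directory is the repository root.
    # Dry run by default, so this only prints what would be removed.
    for path in delete_stale_judge_files("src/unitxt/catalog/metrics"):
        print(path)
```

The dry run is the default so the list can be compared against the CI log before anything is deleted; pass `dry_run=False` to remove the files.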
@yoavkatz Done.
Closing as this is merged -> https://github.com/IBM/unitxt/pull/1948