
Evaluation specifics

Open piotrmigdalek opened this issue 1 year ago • 2 comments

Hi!

I'm trying to evaluate a Mistral-7B-based model with custom locality and portability data. For each of the 50 edits I have 6 locality prompts and 2 portability prompts.

How should I arrange the dicts to feed them into an edit function in that case? Will the variable below, passed as portability_inputs, work as intended?

portability_inputs = {
    'english': {
        'prompt': df_port['question_en'].tolist(),
        'ground_truth': df_port['label_en'].tolist()
    },
    'polish': {
        'prompt': df_port['question_pl'].tolist(),
        'ground_truth': df_port['label_pl'].tolist()
    }
}

And a technical question: are the metrics calculated after each edit? If so, is there an option to evaluate everything on the final model after all 50 sequential edits?

Thank you :)

piotrmigdalek avatar May 10 '24 12:05 piotrmigdalek

Q1:

  • Your usage is correct. Just make sure that, under each dimension (such as "english" and "polish"), the prompt and ground_truth lists contain the same number of items.

  • You can also check that the number of metrics recorded in the logs matches the number of input prompts.
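
The length check described above can be done before any edit is run. The following is only a minimal sketch: `check_aligned` is a hypothetical helper, not part of EasyEdit, and the toy lists stand in for the `df_port` columns from the question.

```python
def check_aligned(inputs, expected=None):
    """Return {dimension: item count}; raise ValueError on any length mismatch.

    `inputs` has the shape used in the question:
    {dimension: {"prompt": [...], "ground_truth": [...]}}.
    `expected` is an optional count that every dimension must match.
    """
    counts = {}
    for name, fields in inputs.items():
        n_prompts = len(fields["prompt"])
        n_truths = len(fields["ground_truth"])
        if n_prompts != n_truths:
            raise ValueError(
                f"{name}: {n_prompts} prompts but {n_truths} ground truths"
            )
        if expected is not None and n_prompts != expected:
            raise ValueError(f"{name}: expected {expected} items, got {n_prompts}")
        counts[name] = n_prompts
    return counts


# Toy data (hypothetical) in place of the DataFrame columns.
portability_inputs = {
    "english": {"prompt": ["q1_en", "q2_en"], "ground_truth": ["a1_en", "a2_en"]},
    "polish": {"prompt": ["q1_pl", "q2_pl"], "ground_truth": ["a1_pl", "a2_pl"]},
}

counts = check_aligned(portability_inputs, expected=2)
print(counts)
```

The same helper can be run on the locality dict, since it only assumes the `prompt` / `ground_truth` layout shown in the question.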

Q2:

  • Unified evaluation after all edits have been applied is not implemented yet, but you can refer to the pseudocode in #220. I will add this feature in the next version. Thank you!
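
As a rough illustration of what "evaluate once on the final model" means, here is a generic sketch. It is not EasyEdit's API and not the pseudocode from #220: `edit_then_evaluate`, `apply_edit`, `score`, and the request keys are all hypothetical names, and the "model" is a plain dict so the example runs on its own.

```python
from typing import Any, Callable, Dict, List


def edit_then_evaluate(
    model: Any,
    requests: List[Dict[str, str]],
    apply_edit: Callable[[Any, Dict[str, str]], Any],
    score: Callable[[Any, Dict[str, str]], Dict[str, float]],
) -> List[Dict[str, float]]:
    """Apply every edit in order, then score each request on the final model."""
    for request in requests:
        model = apply_edit(model, request)  # no metrics computed in this loop
    return [score(model, request) for request in requests]


# Toy stand-ins (hypothetical): the "model" is a prompt -> answer lookup table.
def toy_apply_edit(model, request):
    updated = dict(model)
    updated[request["prompt"]] = request["target_new"]
    return updated


def toy_score(model, request):
    hit = model.get(request["prompt"]) == request["target_new"]
    return {"rewrite_acc": float(hit)}


requests = [
    {"prompt": "capital of X", "target_new": "A"},
    {"prompt": "capital of Y", "target_new": "B"},
    {"prompt": "capital of X", "target_new": "C"},  # overwrites the first edit
]

final_metrics = edit_then_evaluate({}, requests, toy_apply_edit, toy_score)
print(final_metrics)
```

In this toy run the first request scores lower on the final model because a later edit overwrote it, which is the kind of effect that per-edit metrics would not show.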

pengzju avatar May 10 '24 16:05 pengzju

Hi, do you have any further questions?

zxlzr avatar May 12 '24 06:05 zxlzr

Nothing as of now, thanks :)

piotrmigdalek avatar May 29 '24 16:05 piotrmigdalek