llm-foundry Tessa/callibration script

Open tbarton16 opened this issue 1 year ago • 2 comments

Here is the code we use to test our benchmark tasks. It runs a series of progressively more advanced models to see whether the benchmarks effectively differentiate between them, and at which number of shots they perform best.

1. Select an independent variable and a series of models that correspond to the settings of that variable.
2. Select clusters.
3. Edit the list of tasks in base_callibration.yaml to reflect the ones you want to see.
4. Run the launcher script.
5. When everything is done, run the analyze_output notebook, which collates the results from wandb.
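
As a rough illustration of the kind of check the analysis step performs, here is a minimal sketch in Python. It is not the code from the analyze_output notebook: the model names, the scores, and the helper functions `ordered_fraction`, `spread`, and `best_shot_setting` are all hypothetical, and the results are hardcoded rather than pulled from wandb. It assumes "differentiates" means that a more capable model scores strictly higher, and picks the shot count where that ordering holds most often, breaking ties by the gap between the weakest and strongest model.

```python
from itertools import combinations

# Hypothetical models, listed from least to most capable.
MODELS = ["small", "medium", "large"]

# Hypothetical scores for one benchmark: results[num_shots][model] = accuracy.
# All numbers are made up for illustration.
results = {
    0: {"small": 0.26, "medium": 0.25, "large": 0.27},
    5: {"small": 0.30, "medium": 0.41, "large": 0.55},
    10: {"small": 0.31, "medium": 0.40, "large": 0.52},
}


def ordered_fraction(scores, models):
    """Fraction of model pairs where the more capable model scores strictly higher."""
    pairs = list(combinations(models, 2))  # (weaker, stronger) in list order
    good = sum(scores[weak] < scores[strong] for weak, strong in pairs)
    return good / len(pairs)


def spread(scores, models):
    """Score gap between the most and least capable model."""
    return scores[models[-1]] - scores[models[0]]


def best_shot_setting(results, models):
    """Shot count that best separates the models: ordering first, then spread."""
    return max(
        results,
        key=lambda k: (ordered_fraction(results[k], models), spread(results[k], models)),
    )


if __name__ == "__main__":
    for shots, scores in sorted(results.items()):
        print(shots, ordered_fraction(scores, MODELS), round(spread(scores, MODELS), 2))
    print("best number of shots:", best_shot_setting(results, MODELS))
```

With these made-up numbers, the zero-shot setting fails to order the small and medium models, so a benchmark like this would only be considered useful in its few-shot settings.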

Feb 02 '24 23:02 tbarton16

lgtm! I kinda hate checking in notebooks but I do think it's better than a script in this case.

Feb 05 '24 22:02 maxisawesome

Would you mind adding the MCLI name of a test run you launched, so I can go back later to describe the run and view its logs?

Additionally, a screenshot of the resulting notebook would be good, so that when I go back to this later I can confirm that I got the correct results.

Feb 09 '24 18:02 bmosaicml
