How to support new ICL task types in your own codebase
Hi, I want to add a custom ICL task type which corresponds to a new ICL metric in my codebase. Currently we have created our own patch of the training entry point file [link].
I was considering monkey patching the code that maps the new task type to a new metric for this method. However, I will also have to add support for a new DataLoader corresponding to the new ICL task.
Is there any recommendation on how I should support new ICL task types? I would prefer not to maintain my own fork of the MosaicML repo if possible.
Hey @sanjari-orb, admittedly this is not the most straightforward thing to do at the moment. We're working on some refactors that would make it easier.
You will need to make your own copy of train.py, but I don't think you need to fork anything else.
I think this is what is required:
(1) Copy over train.py
(2) Add a line to add your metric to the model's eval metrics (https://github.com/mosaicml/llm-foundry/blob/ddba5c8d7395223a80019e1a2db90af753317195/llmfoundry/models/mpt/modeling_mpt.py#L958)
(3) Copy the ICL evaluator building function into your modified train.py, and add code to create your Evaluator. To avoid a fork of composer, you may need to copy a bit of code from composer into that function as well.
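To make (3) concrete, here is a rough sketch of what the added code in your copied train.py might look like. Treat it as a sketch only: build_my_icl_dataloader and MyICLTaskMetric are placeholders for your own dataloader and metric, and the surrounding variable names (tokenizer, icl_seq_len, device_eval_batch_size, evaluators) are assumed to match what the upstream train.py already defines; only composer's Evaluator is real API.

    from composer.core import Evaluator

    # Placeholder: a DataLoader (or DataSpec) that yields batches for the new ICL task type.
    my_dataloader = build_my_icl_dataloader(tokenizer, icl_seq_len, device_eval_batch_size)

    # metric_names selects which of the model's eval metrics run on this data, so the
    # metric added in step (2) is only computed for this evaluator.
    my_evaluator = Evaluator(
        label='icl/my_custom_task',
        dataloader=my_dataloader,
        metric_names=['MyICLTaskMetric'],
    )

    # Append it to the list of evaluators that train.py already passes to the Trainer.
    evaluators.append(my_evaluator)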
However, I want to be able to specify a suite of evaluation tasks, like tasks_light.yaml, and add sections like
    # In-context learning evaluation
    icl_tasks: <>
    icl_subset_num_batches: 100 # -1, or omit this key entirely, to evaluate on all batches
    eval_gauntlet: '<>'
    icl_seq_len: 1024
to my mcli finetuning YAML. Thus, I would like to reuse the preexisting datasets of tasks_light.yaml and add my own to it. I need third-party, dataset-specific evaluations (e.g., jeopardy, wikipedia), and this data can look very different from the train/validation split of the training loop. My understanding is that the model's eval metrics (https://github.com/mosaicml/llm-foundry/blob/ddba5c8d7395223a80019e1a2db90af753317195/llmfoundry/models/mpt/modeling_mpt.py#L958) will not work on anything except the validation split provided to the trainer.
The eval metrics on the model are used by all Evaluators, and the ICL tasks are run as Evaluators.
But how will the evaluator know that the new metric I introduce should only be computed on data corresponding to my new task, and not on all the ICL tasks? I thought that was the kind of mapping the following code was doing: https://github.com/mosaicml/llm-foundry/blob/54cbb8bd1007b28cd3fe2dc59a34fa298ba6c12e/llmfoundry/utils/builders.py#L213-L237
Ah, in (3) from my original message, you'll be constructing the evaluator yourself, so you can pass the metric name you want.
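In other words, the mapping lives on each Evaluator through its metric_names: only the metrics named there are updated on that evaluator's dataloader. A hedged illustration (the dataloader variables are placeholders, and this assumes jeopardy is configured as a language_modeling task as in the stock task YAMLs):

    from composer.core import Evaluator

    # Each evaluator only updates the metrics it names, so the per-task routing is explicit.
    jeopardy_evaluator = Evaluator(
        label='jeopardy',
        dataloader=jeopardy_dataloader,              # built for you by build_icl_evaluators
        metric_names=['InContextLearningLMAccuracy'])
    my_task_evaluator = Evaluator(
        label='my_new_task',
        dataloader=my_task_dataloader,               # placeholder for your task's loader
        metric_names=['MyICLTaskMetric'])            # your metric runs only on this data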
I see, got it. I was patching the evaluator building code, but wanted to verify whether there was another way. Thank you for confirming!
We're working on some refactors that would make it easier.
Is there any ETA on when the refactoring will be merged btw? It would be great to have support for adding new kinds of evaluation tasks natively.
Yeah, I was just suggesting that instead of monkey patching the library itself, you copy the relevant code over into your train.py script, so that you only have to make changes to the launcher script.
something like:

    from llmfoundry.utils.builders import build_icl_evaluators

    def custom_build_icl_evaluators(icl_tasks, *args, **kwargs):
        # my_task_names / build_my_custom_evaluators are placeholders for your code.
        mine = [t for t in icl_tasks if t.label in my_task_names]
        rest = [t for t in icl_tasks if t.label not in my_task_names]
        evaluators, logger_keys = build_my_custom_evaluators(mine, *args, **kwargs)
        # Call the original function from the library for everything else.
        evs, keys = build_icl_evaluators(rest, *args, **kwargs)
        return evaluators + evs, logger_keys + keys
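Your copied train.py would then call this wrapper in place of the library call and hand the result to the Trainer; a minimal sketch, with the argument and variable names assumed to match the upstream script:

    evaluators, logger_keys = custom_build_icl_evaluators(
        cfg.icl_tasks, tokenizer, cfg.icl_seq_len, cfg.device_eval_batch_size)
    # The resulting Evaluator list is passed to composer's Trainer as eval_dataloader,
    # exactly as the upstream train.py does with the evaluators it builds.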
And sorry, I can't give an ETA on that right now.
Understood. Thanks a lot!
Closing, as this is now handled through a registry.