llm-foundry
Configure eval to give 'loss/eval' that is analogous to 'loss/train'
When I run with an eval set, I only get metrics/eval.
I am wondering if there is a way to configure llm-foundry via yaml to also compute loss/eval in the same way that it computes loss/train.
Hi @tginart yes we can definitely add support for that.
Basically I think we just need to copy this code over to eval.py and test it out.
If you'd like to open a PR I'd be happy to review it! Otherwise I'll add it to my list and get to it soon.
Hi @abhi-mosaic.
I am referring to the metrics from the Composer Evaluator in train.py (https://github.com/mosaicml/llm-foundry/blob/3c66b1c5df668e0684548fef30d00669df64636c/scripts/train/train.py#LL158C1-L162C79)
So I'm not sure we are talking about the same thing?
I'm still running through train.py, not eval.py.
Hi @tginart, is this what you're looking for? ^
The eval metrics such as metrics/eval/LanguageCrossEntropyLoss are computed every eval_interval, which I think defaults to 500ba. You could certainly reduce this interval, but keep in mind that eval metrics are an average over the entire eval dataloader, whereas the loss/train value is the live train loss of a single batch. Evaluating more often would therefore slow down training a lot.
If you want fine-grained eval without using the whole eval_dataloader, you can also set something like eval_subset_num_batches: 10, which would only run over the first 10 batches of the eval dataloader.
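To make the difference between the two numbers concrete, here is a minimal sketch in plain Python. It does not use Composer: `average_eval_loss` is a hypothetical helper, the per-batch losses and token counts are made-up illustrative values, and it assumes the eval loss is averaged per token. It compares an average over the full eval set with an average over only the first few batches, which is roughly the effect of eval_subset_num_batches.

```python
from typing import Iterable, Optional, Tuple


def average_eval_loss(
    batches: Iterable[Tuple[float, int]],
    subset_num_batches: Optional[int] = None,
) -> float:
    """Token-weighted mean loss over (mean_batch_loss, num_tokens) pairs.

    Hypothetical helper for illustration only; not part of Composer.
    """
    total_loss = 0.0
    total_tokens = 0
    for i, (batch_loss, num_tokens) in enumerate(batches):
        # Stop early to mimic evaluating only the first N batches.
        if subset_num_batches is not None and i >= subset_num_batches:
            break
        total_loss += batch_loss * num_tokens
        total_tokens += num_tokens
    if total_tokens == 0:
        raise ValueError("no tokens were evaluated")
    return total_loss / total_tokens


# Illustrative (mean loss, token count) per eval batch; values are made up.
eval_batches = [(2.0, 100), (3.0, 300), (4.0, 100)]

full = average_eval_loss(eval_batches)
subset = average_eval_loss(eval_batches, subset_num_batches=2)
print(f"full eval set: {full:.4f}, first 2 batches: {subset:.4f}")
```

A single loss/train point corresponds to just one of those tuples, which is why it is noisier than the eval average.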
Re. the WandB chart naming, I think you could change the line you linked with label='eval' to something like label='eval/loss', but I believe Composer always appends the torchmetrics metric name at the end, so you would end up with a chart titled metrics/eval/loss/LanguageCrossEntropyLoss.
Hi @abhi-mosaic, thank you. That is what I am looking for, but I was wondering whether it is possible to compute loss/train for every example in the eval set and then take the average. I would only want to do this at eval time. It seems like the way to do this is to write a custom metric for Composer and make sure it gets passed to here?
Hi @tginart, yes. If you want to compute a custom metric (e.g. one that matches the loss), you can create the metric and pass it along in the place you described. I know you asked this question a while ago, but has it been resolved?
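For anyone landing here later, a rough sketch of the shape such a metric could take. This is a plain-Python stand-in that only follows the update()/compute() accumulation pattern used by torchmetrics-style metrics; `MeanEvalLoss` and its arguments are hypothetical names, and a real version would subclass the torchmetrics `Metric` class and operate on tensors rather than lists.

```python
import math
from typing import Sequence


class MeanEvalLoss:
    """Accumulates negative log-likelihood over all eval tokens.

    Hypothetical illustration of the update()/compute() pattern; it is
    not the Composer or llm-foundry implementation.
    """

    def __init__(self, ignore_index: int = -100) -> None:
        self.ignore_index = ignore_index
        self.sum_loss = 0.0
        self.total_items = 0

    def update(
        self,
        log_probs: Sequence[Sequence[float]],
        targets: Sequence[int],
    ) -> None:
        # log_probs[i][v] is the log-probability of vocab id v at position i.
        for row, target in zip(log_probs, targets):
            if target == self.ignore_index:
                continue  # skip padded or masked positions
            self.sum_loss += -row[target]
            self.total_items += 1

    def compute(self) -> float:
        if self.total_items == 0:
            raise ValueError("update() has not seen any valid targets")
        return self.sum_loss / self.total_items


# Two toy eval batches over a 2-token vocabulary; values are made up.
metric = MeanEvalLoss()
metric.update([[math.log(0.5), math.log(0.5)]], [0])
metric.update(
    [[math.log(0.25), math.log(0.75)], [math.log(0.9), math.log(0.1)]],
    [1, -100],  # second position is ignored
)
print(f"mean eval loss: {metric.compute():.4f}")
```

Because the state is accumulated across update() calls and only divided in compute(), the result is an average over every non-ignored token in the eval set rather than an average of per-batch averages.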
Closing due to inactivity. Please feel free to reopen/or open a new issue if this is not resolved.