
Feat: Multi-GPU Evaluation

MattGPT-ai opened this issue 10 months ago

Flair now supports multi-GPU training, but not multi-GPU evaluation. This means that n-1 GPUs sit idle during evaluation, which can dramatically reduce the benefit of multi-GPU training if your eval set is large. Even worse, I believe it can be slower than single-GPU evaluation, since the CPU portions of the evaluation code are repeated n times while competing for the same CPU and memory resources.

This PR implements multi-GPU acceleration for evaluate in the Classifier, TextRegressor, and TextPairRegressor model types. It uses a DistributedSampler to split the eval set across the GPUs, runs predictions on each shard, and gathers the inference results across processes before the metrics are computed in the main process and returned.
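To illustrate the sharding step, here is a minimal sketch of how a `DistributedSampler` partitions an eval set across ranks. This is illustrative only, not the PR's actual code: `ToyEvalSet` and `world_size` are made-up names, and no process group is launched, since the sampler can be constructed with explicit `num_replicas`/`rank` arguments.

```python
from torch.utils.data import Dataset, DistributedSampler


class ToyEvalSet(Dataset):
    """Stand-in for an eval set; each item is just its own index."""

    def __init__(self, n: int):
        self.n = n

    def __len__(self) -> int:
        return self.n

    def __getitem__(self, i: int) -> int:
        return i


world_size = 4          # pretend we have 4 GPUs/processes
dataset = ToyEvalSet(10)

# Each rank sees a disjoint shard of the eval set (padded with repeats
# so every rank gets the same number of items).
shards = [
    list(DistributedSampler(dataset, num_replicas=world_size, rank=r, shuffle=False))
    for r in range(world_size)
]
```

After each rank runs prediction on its shard, the results would be gathered across processes (e.g. with `torch.distributed.all_gather_object`) so that the main process can compute metrics over the full eval set.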

MattGPT-ai avatar Feb 05 '25 06:02 MattGPT-ai

Looks like the checks are hitting an unrelated type error: `4256: error: Argument 1 to "Entity" has incompatible type "tuple[Optional[int], int]"; expected "tuple[int, int]" [arg-type]`

MattGPT-ai avatar Feb 05 '25 18:02 MattGPT-ai

@MattGPT-ai this is due to a new mypy version and affected a deprecated class. I just fixed it in #3613. If you update this branch to current master the error should disappear.

alanakbik avatar Feb 05 '25 20:02 alanakbik

Awesome, that worked; checks passed!

MattGPT-ai avatar Feb 06 '25 00:02 MattGPT-ai

We have been using this change successfully in our fork for about a month now. It's been a major speed improvement, especially when evaluation sets are large!

MattGPT-ai avatar Mar 04 '25 23:03 MattGPT-ai