evaluate icon indicating copy to clipboard operation
evaluate copied to clipboard

Add support for two input columns for TextClassificationEvaluator

Open fxmarty opened this issue 3 years ago • 4 comments
trafficstars

This PR:

  • Refactors the docstrings to avoid duplicates in the Evaluator subclasses
  • Put all arguments in the compute() signature for subclasses of Evaluator, so as to be able to call the method without keyword arguments (had the issue with issue classification since we redefined the default input_column="image")
  • Adds support for two inputs columns (as in glue/mnli) for TextClassificationEvaluator

@lvwerra I added a DatasetColumnTextClassification because of the input expected by the transformers TextClassificationPipeline: https://github.com/huggingface/transformers/blob/d0acc9537829e7d067edbb791473bbceb2ecf056/src/transformers/pipelines/text_classification.py#L109-L111 . I did not manage to use DatasetColumn for it, could we do better than what I did? It's a bit hacky here. Probably memory is not so much sensitive for text compared to images, so we could not use these wrappers altogether?

This stills misses a bit of documentation.

Partially closes https://github.com/huggingface/evaluate/issues/196

fxmarty avatar Jul 26 '22 15:07 fxmarty

The documentation is not available anymore as the PR was closed or merged.

Looks great! Thanks for adding this, I left a a few minor comments. I agree with @ola13 that splitting such a PR in two would make it much easier to review (for next time) :)

lvwerra avatar Jul 28 '22 10:07 lvwerra

Hello, thank you for your feedback! I will be careful next to break PRs in unitary changes.

fxmarty avatar Aug 01 '22 07:08 fxmarty

@lvwerra @ola13 Feel free to re-review! Sorry for the lot of code change for a minimal new feature, will break it up in the future.

fxmarty avatar Aug 01 '22 08:08 fxmarty