Image Classification Pipeline returns score= 1.0

Open guillaumeguy opened this issue 1 year ago • 3 comments

System Info

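The Vision Transformer (ViT) documentation says that regression is supported when num_labels == 1. However, this seems to be incompatible with the image classification pipeline.

In the code linked below, the logits are normalized into scores with a softmax. When num_labels = 1, a softmax over a single logit always yields 1.0, so the returned score is 1.0 regardless of what the model predicts.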

https://github.com/huggingface/transformers/blob/63b204eadd9829985ba13e7e4d51f905adfc2d5e/src/transformers/pipelines/image_classification.py#L116
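
To illustrate why the score collapses to 1.0, here is a minimal, framework-free sketch of a softmax. The `softmax` helper below is written for illustration only and is not the pipeline's own code; it just shows that normalizing a single value always gives 1.0.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a plain list of floats (illustrative helper)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# With a single logit (num_labels == 1) the result is always [1.0],
# whatever value the regression head produced.
for logit in (-3.2, 0.0, 41.7):
    print(logit, softmax([logit]))
```

With two or more logits the scores differ and sum to 1, which is why the problem only shows up for single-output regression heads.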

Who can help?

@amyeroberts @nielsr

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [X] My own task or dataset (give details below)

Reproduction

  1. Train a ViT on a regression task (num_labels = 1)
  2. Run inference with the image classification pipeline

image

Expected behavior

The model should return its predictions (the regression outputs) rather than a constant score of 1.0.

guillaumeguy avatar Jan 25 '23 16:01 guillaumeguy

There is no pipeline available for regression tasks; you need to use the model directly and take its outputs.
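
A minimal sketch of what using the model directly could look like, assuming a PyTorch model whose forward pass returns an object with a `.logits` attribute and an image processor that accepts `images=` and `return_tensors="pt"`. The function name `predict_regression` and its arguments are hypothetical and not part of the library.

```python
def predict_regression(model, processor, image):
    """Return the raw regression output(s) for a single image.

    Assumes `processor` and `model` follow the usual transformers calling
    conventions; both are passed in by the caller.
    """
    # Turn the image into model inputs (e.g. a dict with "pixel_values").
    inputs = processor(images=image, return_tensors="pt")
    outputs = model(**inputs)
    # logits has shape (1, num_labels); detach from autograd, drop the batch
    # dimension, and skip any softmax so the regression value is preserved.
    return outputs.logits.detach().squeeze(0).tolist()
```

For larger batches it would be reasonable to wrap the forward pass in `torch.no_grad()` so that no autograd graph is built during inference.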

sgugger avatar Jan 25 '23 16:01 sgugger

Thanks @sgugger! Super fast answer!

As I found the pipelines to be very helpful, I'm sharing my solution below for folks who still want to use them.

One can just override the pipeline's postprocess method so that it returns the raw logits instead of normalized scores:

def postprocess(self, model_outputs, top_k=5):
    # top_k is kept only for signature compatibility; it has no effect
    # when the model has a single regression output.
    if self.framework != "pt":
        raise ValueError(f"Unsupported framework: {self.framework}")

    # Return the raw logits (the regression outputs) without normalization.
    return model_outputs.logits.tolist()

You can then instantiate the pipeline from the overridden class:

pipe = ImageClassificationPipeline(model=model, feature_extractor=extractor, device='cuda:0')

And run your inference:

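from PIL import Image as PILImage
from tqdm import tqdm


def data():
    # `paths` is an iterable of image file paths defined elsewhere.
    for path in paths:
        yield PILImage.open(path)


# Passing a generator lets the pipeline stream over the images one by one.
scores = []
for out in tqdm(pipe(data())):
    scores.append(out)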

guillaumeguy avatar Jan 25 '23 16:01 guillaumeguy

Yes, that's why the pipeline is called classification rather than regression. We would need an ImageRegressionPipeline for this use case ;)

NielsRogge avatar Jan 26 '23 11:01 NielsRogge

Closing this issue as it seems resolved.

NielsRogge avatar Jan 31 '23 10:01 NielsRogge