Image transforms library

Open amyeroberts opened this issue 3 years ago • 1 comments

What does this PR do?

This is the first of a series of PRs to replace feature extractors with image processors for vision models.

Create a new module, image_transforms.py, that will contain functions for transforming images, e.g. resize.

The functions are designed to:

  • Accept numpy arrays.
  • Return numpy arrays (except for e.g. to_pil_image).
  • Provide logic such that the new image processors produce the same outputs as feature extractors when called directly.
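As a rough illustration of the numpy-in, numpy-out design described above, here is a minimal sketch. The function names, signatures, and the channels-last layout are assumptions made for this example and are not taken from the PR's image_transforms.py.

```python
import numpy as np


def rescale(image: np.ndarray, scale: float) -> np.ndarray:
    """Hypothetical transform: multiply pixel values by `scale`.

    Accepts a numpy array and returns a new float32 numpy array.
    """
    return image.astype(np.float32) * scale


def normalize(image: np.ndarray, mean, std) -> np.ndarray:
    """Hypothetical transform: per-channel (x - mean) / std.

    Assumes a channels-last image of shape (height, width, channels).
    """
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    return (image.astype(np.float32) - mean) / std


# Usage: chain the transforms the way a processor's call method might.
image = np.full((2, 2, 3), 255, dtype=np.uint8)
scaled = rescale(image, 1 / 255)
normalized = normalize(scaled, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
```

Keeping each transform a plain function on arrays means the same steps can be called directly or composed inside a processor class, which makes it easier to check that the outputs match those of the existing feature extractors.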

Subsequent PRs:

  • Image Processor Mixin: https://github.com/amyeroberts/transformers/pull/25
  • GLPNImageProcessor: https://github.com/amyeroberts/transformers/pull/23
  • GLPNFeatureExtractor -> GLPNImageProcessor alias: https://github.com/amyeroberts/transformers/pull/24
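One way such an alias could look is sketched below. This is an illustrative assumption, not the code from the linked PR: the class bodies, the `do_resize` parameter, and the choice of `FutureWarning` are all hypothetical.

```python
import warnings


class GLPNImageProcessor:
    """Stand-in for the new image processor (body is hypothetical)."""

    def __init__(self, do_resize: bool = True):
        self.do_resize = do_resize


class GLPNFeatureExtractor(GLPNImageProcessor):
    """Backwards-compatible alias: same behaviour under the old name."""

    def __init__(self, *args, **kwargs):
        # Assumption: warn on use of the old name, then defer to the new class.
        warnings.warn(
            "GLPNFeatureExtractor is an alias; use GLPNImageProcessor instead.",
            FutureWarning,
        )
        super().__init__(*args, **kwargs)
```

With this shape, existing code that instantiates GLPNFeatureExtractor keeps working and receives a GLPNImageProcessor instance under the old name.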

Fixes # (issue)

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Did you read the contributor guideline, Pull Request section?
  • [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

amyeroberts avatar Aug 08 '22 11:08 amyeroberts

@sgugger @NielsRogge @alaradirik @LysandreJik Adding you all for a first-pass review of the draft ImageProcessor work. This PR is currently failing because it doesn't safely import optional dependencies such as PIL when they aren't available, but the core logic shouldn't change. I'll add you to the follow-up PRs too. Note: ImageProcessor has only been implemented for the GLPN model so far.

amyeroberts avatar Aug 08 '22 14:08 amyeroberts

The documentation is not available anymore as the PR was closed or merged.

@alaradirik @sgugger I've now merged in the stacked PRs above this one. This PR has the transforms library and the image processor for GLPN. Thanks for all of your reviews so far! This should be ready for a final review to make sure all the pieces work together before merging.

amyeroberts avatar Aug 18 '22 10:08 amyeroberts

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Sep 27 '22 15:09 github-actions[bot]

@alaradirik @NielsRogge Could you (re-)review?

amyeroberts avatar Sep 28 '22 09:09 amyeroberts

I just have a question regarding multi-modal models such as CLIP and OWL-ViT. These models have both feature extractors and processors, which call their respective tokenizer and feature extractor. Wouldn't creating XXModelProcessor aliases for their feature extractors create issues?

@alaradirik I believe this should be OK, as the feature extractors are being mapped to XxxImageProcessor rather than XxxProcessor, so there's no clash of names. Not sure if this answers your question or I've missed the consequence you're asking about.

amyeroberts avatar Oct 10 '22 17:10 amyeroberts