[ENH] Image preprocessing for TFHub
While using the TFHub module, I found some issues in the image preprocessing step, before the images are fed into the model. This PR suggests a way to fix them.
One issue is that some models require the images to be in the [0, 255] range, using an integer dtype such as tf.uint8. My proposal is to make the dtype parametrizable, allowing the user to skip rescaling and keep the data as uint8.
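To make this concrete, here is a minimal sketch of a parametrizable dtype (written with NumPy, and with illustrative names that are not the actual pliers API): rescaling to [0, 1] happens only when a float dtype is requested, and uint8 data stays in [0, 255] otherwise.

```python
import numpy as np

def preprocess_image(img, dtype=np.float32, rescale_rgb=True):
    """Illustrative preprocessing: rescale to [0, 1] only when a float
    dtype is requested; keep uint8 data in the [0, 255] range otherwise."""
    arr = np.asarray(img)
    if rescale_rgb and np.issubdtype(dtype, np.floating):
        return (arr / 255.0).astype(dtype)  # float array in [0, 1]
    # Integer dtypes (e.g. uint8) skip rescaling and keep [0, 255]
    return arr.astype(dtype)

pixels = np.array([[0, 128, 255]], dtype=np.uint8)
floats = preprocess_image(pixels)                    # float32 in [0, 1]
raw = preprocess_image(pixels, dtype=np.uint8,
                       rescale_rgb=False)            # unchanged uint8
```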
The other issue is that the specialization of the _preprocess function in the TFHubImageExtractor class supersedes the custom transform_inp function. That function is useful, as it allows working with data that is already in tensor form.
Please let me know your thoughts about it!
Cool, I'll try to take a look at this, but if I don't, feel free to bug me.
Cool, okay two more things:
- @rbroc can you take a quick look?
- We need to fix tests before we can merge! I'm guessing it's unrelated, but let's fix that in a separate PR.
Hey, thanks for spotting this, and sorry for the delay!
Re: the dtype problem. Couldn't we simply convert x to a tensor before rescaling and cast it to tf.float32 by default, triggering a warning ("we're converting to float") instead of an error?
Re: transform_inp. The current logic leaves this attribute as None for the image extractor, and it remains unused. Not super elegant, but the idea is that in most cases the only meaningful preprocessing transformations for images will be rescaling and resizing.
If we have reasons to think other transformations are relevant, we could allow the user to pass transform_inp to this extractor too. In that case, though, I think transform_inp should either be applied to stim.data before resizing, or between resizing and rescaling.
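That convert-first, warn-instead-of-error flow could be sketched like this (NumPy standing in for TensorFlow tensors; the function and argument names are illustrative, not the actual pliers API):

```python
import warnings
import numpy as np

def to_model_input(x, dtype=np.float32, rescale_rgb=True):
    """Convert the input to an array first, then rescale; if an integer
    dtype is requested together with rescaling, warn and fall back to
    float32 instead of raising an error."""
    arr = np.asarray(x)
    if rescale_rgb:
        if np.issubdtype(dtype, np.integer):
            warnings.warn("rescale_rgb=True needs a float dtype; "
                          "converting to float32 and ignoring `dtype`")
            dtype = np.float32
        arr = arr / 255.0
    return arr.astype(dtype)
```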
@rbroc Thank you for reviewing it! I also haven't checked what's going on with the tests yet, but I should soon.
The current dtype conversion does not raise an error; it only shows a warning when rescale_rgb=True and dtype is an integer type. I agree it can convert to a tensor before rescaling, cast to float32 when rescale_rgb=True, and report that the dtype argument was ignored.
I am not sure about the ordering of operations. Reshaping/reordering the tensor is a use case for when the model requires a different ordering or number of dimensions.
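For instance (illustrative shapes only), a single channels-last image may need a batch axis added, or its channels moved first, before it matches what a model expects:

```python
import numpy as np

img = np.zeros((224, 224, 3), dtype=np.float32)  # one HWC (channels-last) image

batched = img[np.newaxis, ...]       # add batch axis -> (1, 224, 224, 3)
chw = np.transpose(img, (2, 0, 1))   # channels-first -> (3, 224, 224)
```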
hey @anibalsolon - thanks! Couple of considerations:
- let's allow users to pass `dtype` for sure (it also needs to be added to the docstring);
- I'm a bit torn about the rescaling issue. Maybe it makes sense to raise an error when rescaling is requested and `dtype` is set to an integer type (I can't imagine any scenario where you would want to do that). Alternatively, we could implement rescaling as a filter to make it more pliers-y (@adelavega)?
- Bonus consideration, along the same lines: we could also remove the resizing and let the user apply the filter (this is less familiar for ML folks but more in line with pliers logic);
- re: other transformations: resizing is handled by the filter, and I'd lean towards not allowing any extra transformations unless we have concrete use cases (but open to changing my mind). The `transform_inp` argument was meant for the generic TFHub model, which is open to any `Stim`, and it's hard to anticipate what transformations are expected there (while for image and text there are standard sets of expected preprocessing steps). Yet I agree that it's not elegant that `transform_inp` is inherited but not used by the image and text classes... @adelavega any strong opinion on this?
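A filter-based version of the rescaling, as floated in this thread, might look roughly like the standalone sketch below. The class name is hypothetical, and a real implementation would subclass pliers' image filter base class and operate on the stimulus data rather than a bare array:

```python
import numpy as np

class ImageRescalingFilter:
    """Hypothetical filter mapping [0, 255] RGB data to [0, 1] floats,
    so the extractor's _preprocess no longer needs a rescaling step."""

    def __init__(self, scale=255.0):
        self.scale = scale

    def transform(self, data):
        # Cast to float32 and divide by the scale factor
        return np.asarray(data, dtype=np.float32) / self.scale
```

Applied before the extractor in a pliers pipeline, this would keep the extractor itself minimal.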
@anibalsolon I made a PR to your PR (so meta).
LMK what you think. I think making the data type parameterizable is still fine to do within this function because it's such a domain-specific transformation, but more general transformations should occur outside the extractor if such a Filter exists (and for RGB-to-unit scaling, let's make one!)
It may bother the user a bit to keep seeing that warning about shapes, but it should be fine. I've finally merged your PR, thank you for that!
Hey @anibalsolon
Do you want to meet about this sometime soon? I'm happy to give this a full review.
Also, can you rebase w/ master? I've fixed the tests recently
Codecov Report
Base: 75.65% // Head: 74.02% // Decreases project coverage by 1.62% :warning:
Coverage data is based on head (47db2ee) compared to base (4237acd). Patch coverage: 100.00% of modified lines in pull request are covered.
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           master     #459      +/-   ##
==========================================
- Coverage   75.65%   74.02%   -1.63%
==========================================
  Files          65       60       -5
  Lines        3877     4011     +134
==========================================
+ Hits         2933     2969      +36
- Misses        944     1042      +98
```
| Impacted Files | Coverage Δ |
|---|---|
| pliers/extractors/models.py | 93.84% <100.00%> (-0.10%) :arrow_down: |
| pliers/filters/image.py | 98.61% <100.00%> (+0.22%) :arrow_up: |
| pliers/transformers/api/google.py | 38.88% <0.00%> (-2.29%) :arrow_down: |
| pliers/converters/api/wit.py | 45.65% <0.00%> (-2.27%) :arrow_down: |
| pliers/support/decorators.py | 80.00% <0.00%> (-1.82%) :arrow_down: |
| pliers/extractors/api/clarifai.py | 21.29% <0.00%> (-1.72%) :arrow_down: |
| pliers/transformers/api/microsoft.py | 25.00% <0.00%> (-1.16%) :arrow_down: |
| pliers/stimuli/api.py | 37.73% <0.00%> (-1.16%) :arrow_down: |
| pliers/converters/api/revai.py | 25.75% <0.00%> (-1.11%) :arrow_down: |
| pliers/converters/api/ibm.py | 24.71% <0.00%> (-0.84%) :arrow_down: |
| ... and 20 more | |
:umbrella: View full report at Codecov.
Tests are passing! I agree with @anibalsolon that the warning might be excessive to show every time the object is instantiated.
Perhaps we can catch a specific type of exception or just document this well in the docstring instead?
Will think about this, make the change and merge
Test failures are unrelated, so merging