open_flamingo
Create extra samples with surplus images
Addresses issue https://github.com/mlfoundations/open_flamingo/issues/231
- chunk and yield a sample every `max_num_images` valid images
- refactor `preprocess_gpt_interleaved` and `preprocess_interleaved`
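For illustration, the chunking idea above can be sketched roughly as follows. This is a minimal sketch and not the code in this PR: `chunk_valid_images` is a hypothetical helper name, the images are treated as opaque items, and yielding the final partial chunk is an assumption.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunk_valid_images(images: Iterable[T], max_num_images: int) -> Iterator[List[T]]:
    """Yield lists of at most `max_num_images` items, in order.

    Hypothetical helper for illustration only. Instead of dropping the
    surplus images beyond the first `max_num_images`, every full chunk
    becomes its own sample.
    """
    if max_num_images < 1:
        raise ValueError("max_num_images must be >= 1")

    chunk: List[T] = []
    for image in images:
        chunk.append(image)
        if len(chunk) == max_num_images:
            yield chunk
            chunk = []

    # Assumption: a trailing partial chunk is still yielded as a sample.
    if chunk:
        yield chunk
```

With seven valid images and `max_num_images=3`, this would produce three samples holding 3, 3, and 1 images respectively.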
Few questions before I make more changes:
- What is a good way to test this?
- I've noticed some subtle differences in the image/text processing steps between `preprocess_gpt_interleaved` and `preprocess_interleaved`, and I'm wondering if these could be consolidated, e.g.:
  - the gpt one pads with shape (3, 224, 224), while mmc4 depends on the image size.
  - mmc4 has a 50% chance of keeping single-image samples, while gpt does not.
  - mmc4 avoids the situation where there's one `<image>` token and it's at the end, while gpt does not.
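To make the last two mmc4 behaviours concrete, here is a rough sketch of what a shared filter could look like. It is only an illustration under assumptions: `keep_sample` is a hypothetical name, the sample is modelled as a plain list of string tokens, the image placeholder is assumed to be `"<image>"`, and rejecting samples with no image token at all is also an assumption.

```python
import random
from typing import List, Optional

IMAGE_TOKEN = "<image>"  # assumed placeholder token, for illustration only


def keep_sample(tokens: List[str], rng: Optional[random.Random] = None) -> bool:
    """Return True if the sample should be kept (hypothetical helper)."""
    rng = rng or random.Random()
    num_images = tokens.count(IMAGE_TOKEN)

    # Assumption: a sample without any image token is not useful.
    if num_images == 0:
        return False

    if num_images == 1:
        # Avoid a single image token sitting at the very end of the
        # sequence, where no text follows it.
        if tokens[-1] == IMAGE_TOKEN:
            return False
        # Keep single-image samples with 50% probability.
        return rng.random() > 0.5

    return True
```

Passing a seeded `random.Random` makes the 50% decision reproducible, which might also help with the testing question above.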
Yeah I think we can default to the mmc4 code for all of these.
For the first point, this should be based on the image size. I think because most vision encoders take 224x224 inputs we just defaulted to that, but that isn't the right way to do it.
@anas-awadalla here's a first complete draft. Please let me know what you think. Separately, I think a pre-commit hook would benefit development. I could raise this in a separate PR if needed.
Great will check it tomorrow
What would the hook contain? The code formatting?
Yep, kind of like this or whatever else we want.
This would be awesome!
@anas-awadalla hoping to get your feedback on this when you get a chance next 🙏
Sorry @isaac-chung got busy with a paper I am pushing. I will definitely review and merge this week tho!
No worries, good luck with the paper!
@anas-awadalla a gentle nudge to bubble this back up in your inbox. Hoping to close this soon 🙏