
VLM: add more modularity

Open zucchini-nlp opened this issue 4 months ago • 1 comment

What does this PR do?

As mentioned in https://github.com/huggingface/transformers/issues/33948, this PR refactors the code a bit to make it more modular. Specifically, we will now have dedicated public methods for obtaining image/video features, which users can easily override if they want to modify the process. In any case, this leaves less code in forward and makes the API more standardized.
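To illustrate the idea, here is a minimal sketch of the pattern in plain Python. It is not the actual transformers code: `ToyVLM`, `MeanPooledVLM`, `encode_image`, and the method name `get_image_features` are all hypothetical names chosen for this example, and plain lists of floats stand in for tensors.

```python
from typing import List

Vector = List[float]


class ToyVLM:
    """Toy stand-in for a vision-language model (hypothetical, not the transformers API)."""

    def __init__(self, scale: float = 2.0) -> None:
        self.scale = scale

    def encode_image(self, pixels: Vector) -> Vector:
        # Stand-in for a vision tower: scale every pixel value.
        return [p * self.scale for p in pixels]

    def get_image_features(self, pixels: Vector) -> Vector:
        # Public hook (name is an assumption): vision tower followed by a
        # stand-in "projection" that adds 1.0 to every value.
        return [h + 1.0 for h in self.encode_image(pixels)]

    def forward(self, text_embeds: Vector, pixels: Vector) -> Vector:
        # forward stays short: it delegates image handling to the hook and
        # then joins the image features with the text embeddings.
        image_features = self.get_image_features(pixels)
        return image_features + text_embeds


class MeanPooledVLM(ToyVLM):
    """User subclass that changes only how image features are produced."""

    def get_image_features(self, pixels: Vector) -> Vector:
        # Reuse the default features, then mean-pool them into one value.
        features = super().get_image_features(pixels)
        return [sum(features) / len(features)]


base_out = ToyVLM().forward([0.5], [1.0, 2.0, 3.0])
pooled_out = MeanPooledVLM().forward([0.5], [1.0, 2.0, 3.0])
print(base_out)
print(pooled_out)
```

The subclass changes the feature extraction step without copying or touching forward, which is the kind of customization this refactor is meant to make easy.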

zucchini-nlp, Oct 15 '24 14:10