ml-aim

Native resolution support for larger model sizes

Open MonolithFoundation opened this issue 1 year ago • 4 comments

Hi, would you consider open-sourcing larger native-resolution models, especially the 1B one?

MonolithFoundation • Nov 27 '24 09:11

Thank you for your interest! At the moment we do not have plans to release larger native resolution models. However, we appreciate your feedback and will keep this in mind.

DonkeyShot21 • Nov 28 '24 13:11

Hello, I am also very interested in large native-resolution models (even ones as large as 1B).

Currently, vision encoders (VEs) with a fixed input size are not very effective on high-resolution inputs, such as in document understanding and related domains (we can only resort to interpolation, which significantly sacrifices accuracy).

I hope your team will consider open-sourcing large native-resolution models to benefit the community.

lucasjinreal • Nov 30 '24 01:11

Thanks for your feedback, @MonolithFoundation and @lucasjinreal! We will consider adding native resolution support for the higher capacity models in our short/mid-term plans and will keep you updated.

aelnouby • Dec 02 '24 12:12

Thank you so much for the consideration!

MonolithFoundation • Dec 03 '24 02:12