surya

Word vs Sentence Detection - Visual Question Answering

Open: DelmedigoA opened this issue 1 year ago • 1 comment

Hello everyone. I’m currently using the model, and it performs exceptionally well at detecting sentences. However, I’m wondering whether it could also be adapted for word-level detection. If so, could anyone advise which settings would need to be adjusted, or is it more a matter of image preprocessing? I’m asking because many Hugging Face VQA models rely on word-level tokenization, and I’d like to align with that approach. Thanks a lot!

DelmedigoA, Aug 16 '24 10:08

Did you find the answer?

vakidzaci, Dec 04 '24 05:12