instructor_ex

Integration with local LLaVA, do we want it? What is the way?

Open thbar opened this issue 10 months ago • 3 comments

I've been looking for local-only solutions to reliably extract structured data from invoices/receipts. API/cloud solutions are ruled out for obvious privacy reasons: those receipts can sometimes include credentials or account identifiers, and I don't want that data to leave the server.

Thanks to a tweet, I came across this apparently very nice solution:

https://github.com/haotian-liu/LLaVA

A first test via their demo page with a real restaurant receipt worked very well (just as well as GPT-4 currently does); see https://twitter.com/thibaut_barrere/status/1773031570259001720 for the input and output.

I see (in #36) that other people are interested in extracting data from images.

Bumblebee is also a possibility with a proper choice of model, of course.

Would this have its place in instructor_ex, in your opinion?

If yes, is there a recommended path to integrate a new model? (I'm not sure I'll tackle this myself, but I'm at least interested in discussing it.)

thbar, Mar 27 '24 17:03