Is it possible to generate embeddings that can be queried later
The title pretty much says it. SAM has the idea of encode once, decode anytime later, which makes it easier to engineer systems around it. Can something similar be implemented in GroundingDINO? Can the encoder's embeddings be cached so that they can be decoded and matched against an input text prompt at runtime?
Thanks for your valuable question.
I believe it is possible to cache only the image embedding features in Grounding DINO, but this would require modifying the current code. We will try to support this in a later update. It would also be helpful if you'd like to join us by submitting a PR.
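For anyone picking this up, here is a minimal sketch of the encode-once, decode-later pattern being discussed. It is not the Grounding DINO API: `EmbeddingCache`, `fake_encode_image`, and `fake_decode` are hypothetical stand-ins for illustration only. One caveat, as far as I understand the architecture: Grounding DINO fuses image and text features inside its feature enhancer, so probably only the text-independent image backbone output could be cached this way, and everything from the fusion step onward would still run per prompt.

```python
import hashlib
from typing import Any, Callable, Dict


class EmbeddingCache:
    """Encode-once, decode-later cache keyed by a hash of the image bytes."""

    def __init__(self, encode_image: Callable[[bytes], Any]):
        self._encode_image = encode_image
        self._store: Dict[str, Any] = {}
        self.encoder_calls = 0  # counts how often the expensive encoder ran

    def get(self, image_bytes: bytes) -> Any:
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._store:
            # Cache miss: run the (expensive) image encoder exactly once.
            self._store[key] = self._encode_image(image_bytes)
            self.encoder_calls += 1
        return self._store[key]


# Hypothetical stand-in for the text-independent image backbone.
def fake_encode_image(image_bytes: bytes) -> list:
    return [b / 255.0 for b in image_bytes[:4]]


# Hypothetical stand-in for the prompt-dependent part (fusion + decoder).
def fake_decode(image_features: list, text_prompt: str) -> str:
    return f"{len(image_features)} features matched against '{text_prompt}'"


cache = EmbeddingCache(fake_encode_image)
image = b"\x00\x10\x20\x30raw-image-bytes"

# Two different prompts reuse the same cached image features.
for prompt in ("cat", "dog . chair"):
    features = cache.get(image)
    print(fake_decode(features, prompt))
```

In a real integration the cached value would be the backbone's feature tensors rather than a list of floats, and the cache would likely need an eviction policy because those tensors are large.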
Sure, I'd like to try contributing this. Can you recommend which section of the codebase might be a good starting point to look into this?
Hey, was this done?
Are there any updates on this, @SlongLiu? I would be interested in working on this as well. Could you point me to where one could start?