CaCao
This is the official repository for the paper "Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World" (accepted by ICCV 2023).
I used VG-SGG-base-EXPANDED-with-attri.h5 to reproduce Motifs+CaCao, but I get an R of 50 and an mR of 18. Could you provide the dataset code?
https://github.com/Yuqifan1117/CaCao/issues/18#issuecomment-1851747094 I have a question about this statement: after encoding images and texts with CLIP, the shapes are [B,50,768] and [B,77,512], respectively. How can I set their seq_length to 4...
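For readers unsure where those two shapes come from, here is a minimal sketch. It assumes the ViT-B/32 variant of CLIP with 224x224 inputs, which the issue does not state but which is consistent with the reported numbers. The helper `vit_token_count` and the batch size are hypothetical, for illustration only, and are not part of the CaCao code.

```python
def vit_token_count(image_size: int, patch_size: int) -> int:
    """Tokens emitted by a ViT encoder: one per image patch plus one class token."""
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1


B = 2  # hypothetical batch size

# Assumption: CLIP ViT-B/32, i.e. 224x224 input, 32x32 patches, vision width 768.
image_shape = (B, vit_token_count(224, 32), 768)

# Assumption: CLIP's text context length of 77 tokens and text width 512.
text_shape = (B, 77, 512)

print(image_shape, text_shape)
```

Under these assumptions the image sequence length is 7 * 7 + 1 = 50 and the text sequence length is CLIP's fixed context length of 77. The two encoders also differ in hidden width (768 vs. 512), so the sequences do not share either a length or a feature dimension as they come out of the encoders.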
Again, thank you for sharing the code for this awesome work. I would like to confirm the implementation of Epic, the entangled cross-modal prompt approach for open-world predicate scene graph generation...