[kosmos-2] The code for GRIT construction
Describe Model I am using: kosmos-2

Will you update the code for the GRIT construction process? I'd like to finetune kosmos-2 on app UI scenes, but the details of GRIT construction are not clear enough for me. For example, the steps "get noun chunks and regions from the detector" and "input the image and noun chunks into GLIP to obtain bboxes" seem to be the same? Thanks for your great work!
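For concreteness, here is a minimal sketch of how I read the two steps, assuming spaCy for the noun-chunk extraction; the grounding call at the end is only a hypothetical placeholder for GLIP inference, not the actual GRIT pipeline code:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

caption = "A settings screen with a blue save button and a search bar."
doc = nlp(caption)

# Step 1: extract candidate noun chunks from the caption text only (no image involved).
noun_chunks = [chunk.text for chunk in doc.noun_chunks]
print(noun_chunks)  # e.g. ['A settings screen', 'a blue save button', 'a search bar']

# Step 2: ground each noun chunk in the image with GLIP to obtain bounding boxes.
# `run_glip` is a hypothetical wrapper; replace it with your own GLIP inference code.
# bboxes = run_glip(image_path="screenshot.png", phrases=noun_chunks)
```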
Oh, sorry! I made a mistake. Another question: is the GRIT generation process strict? In a specific scene, GLIP might not be able to recognize all objects. In that case, is it possible to generate object bboxes, captions, and noun chunks manually for fine-tuning?
Yes, manual annotations would be quite helpful.
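For illustration, a manually built sample might look like the sketch below. The field layout (caption text plus per-chunk character spans and normalized box coordinates) is my assumption about the released GRIT format, so please verify it against the actual data files before fine-tuning:

```python
# Hedged sketch of a manual annotation record; field names and the
# normalized-coordinate convention are assumptions, not the confirmed GRIT schema.
manual_sample = {
    "caption": "A login screen with a blue sign-in button.",
    "image": "app_ui/login_screen_001.png",  # illustrative path
    "noun_chunks": [
        # [char_start, char_end, x_min, y_min, x_max, y_max, confidence]
        # coordinates normalized to [0, 1]; confidence can be 1.0 for manual boxes
        [0, 14, 0.00, 0.00, 1.00, 1.00, 1.0],   # "A login screen"
        [20, 41, 0.35, 0.70, 0.65, 0.80, 1.0],  # "a blue sign-in button"
    ],
}
```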
Hi! I am also curious about the construction of the GRIT dataset. It is mentioned in the paper that
We eliminate certain abstract noun phrases that are challenging to recognize in the image, such as “time”, “love”, and “freedom”, to reduce potential noise.
So, are the abstract noun phrases eliminated manually or using spaCy? Many thanks!
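If it helps, one plausible automatic route (my assumption, not confirmed by the paper) is to filter spaCy noun chunks against a small hand-curated list of abstract head nouns, roughly like this:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# Illustrative stop list; the actual list used for GRIT is not specified here.
ABSTRACT_NOUNS = {"time", "love", "freedom", "idea", "way"}

def keep_chunk(chunk) -> bool:
    # Drop chunks whose head noun lemma is in the abstract-noun list.
    return chunk.root.lemma_.lower() not in ABSTRACT_NOUNS

doc = nlp("A man holding a sign about freedom near a red car.")
kept = [c.text for c in doc.noun_chunks if keep_chunk(c)]
print(kept)  # expected: ['A man', 'a sign', 'a red car'] -- 'freedom' is filtered out
```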