[kosmos-2] Is there a plan to release code for more tasks of kosmos-2?

Open BIGBALLON opened this issue 2 years ago • 3 comments
Thanks for the amazing work [kosmos-2]. From the paper, kosmos-2 can do lots of tasks: (1) visual grounding, (2)-(3) grounded question answering, (4)-(6) multimodal referring via bounding boxes, and (7) grounded image captioning.

Is there any plan to release a demo for these tasks?

BIGBALLON, Jul 04 '23 05:07

Thanks! All of those are on our release schedule. If you can't wait to try them, please refer to here.

pengzhiliang, Jul 04 '23 07:07

@pengzhiliang thanks a lot !!!

BIGBALLON, Jul 04 '23 08:07

We have released the evaluation code for some tasks here.

pengzhiliang, Jul 12 '23 08:07

> We have released the evaluation code for some tasks here.

Thanks for your hard work. Could you continue to release the remaining evaluation code? It would be a valuable contribution to the field.

JierunChen, Oct 11 '23 06:10

> > We have released the evaluation code for some tasks here.
>
> Thanks for your hard work. Could you continue to release the remaining evaluation code? It would be a valuable contribution to the field.

@JierunChen Which tasks would you like to evaluate first? That way we can prioritize the release pipeline.

@pengzhiliang FYI.

donglixp, Oct 11 '23 12:10

@donglixp @pengzhiliang I do not have a strict preference, so please prioritize at your convenience. I would like the evaluation code for the following tasks:

  1. Referring expression generation
  2. Image captioning
  3. Visual question answering

Grateful for your contributions.

JierunChen, Oct 18 '23 06:10