VisionLLM issues

Code and integration with interGPT

1

May I know when you will release your code or the full details of your paper?

[REQUEST] Code and models please!

19

Hello! I am urgently asking for the release of the inference code + model. Training would be good too. Incredibly thankful, very interesting project!

spacewalkingninja

Questions about location tokens

4

Hi, your work is great! But I am confused about the location tokens you used in the decoder. Could you provide more details about them?

Deephome

About object detection

1

I think you feed the tokens below into the LLM ``` ['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ...] ``` for object detection...

chagmgang

About segmentation outputs

Are segmentation outputs (coordinates) directly predicted by the network as floating-point numbers under a next-token prediction loss? This part is quite unclear in the paper. Or are they regressed (using...

kahnchana

About the training time

What is the training time of the whole model?

blue-blue272

An issue is found in recurrence.

1

An issue is found in recurrence. Location tokens, {,... , , ... , }. They are used when the tokenizer decodes, where the LLM outputs some offset coordinates relative...

Maycbj

Question about the ablation

2

Thanks for your awesome work! VisionLLM opens a way towards a generalist vision and language model. However, from the results of the single-task vs. multi-task comparison in the ablation study,...

Richar-Du

VisionLLM v2 checkpoint

5

Hello, thanks for your wonderful work on VisionLLM v2. I'm very interested in your paper. I wonder when the model checkpoint will be released. I would be so grateful if...

KangsanKim07

Vision and vision-language tasks list

1

Thanks for your wonderful work! May I ask for a detailed list of hundreds of public vision and vision-language tasks mentioned in the v2 paper?

aassxun
