VisionLLM

An issue found during reproduction.

Open • Maycbj opened this issue 2 years ago • 1 comment

I found an issue while trying to reproduce the results. The location tokens {, ..., , ..., } are used when the tokenizer decodes the output, so the LLM produces offset coordinates relative to a point (p+x), but the demo you showed uses absolute coordinates (x1, y1, x2, y2). I think you applied some post-processing to the output text, e.g. converting <p0+off0>, <p1+off1> to ,, ,,
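
A minimal sketch of the kind of post-processing described above, to make the question concrete. It assumes a hypothetical token format in which `<p17>` or `<p-4>` encodes a signed integer offset, and an anchor point `(px, py)` that is already known. The regex, the function names `parse_offsets` and `offsets_to_box`, and the sample string are all illustrative assumptions, not the repository's confirmed format or code.

```python
import re

# Hypothetical token format: "<p17>" or "<p-4>" encodes a signed integer offset.
# This is an assumption for illustration, not the repository's confirmed format.
OFFSET_TOKEN = re.compile(r"<p(-?\d+)>")


def parse_offsets(text):
    """Extract the signed integer offset from every location token in text."""
    return [int(value) for value in OFFSET_TOKEN.findall(text)]


def offsets_to_box(anchor, offsets):
    """Turn four offsets relative to an anchor point into (x1, y1, x2, y2)."""
    if len(offsets) != 4:
        raise ValueError(f"expected 4 offsets, got {len(offsets)}")
    px, py = anchor
    dx1, dy1, dx2, dy2 = offsets
    # x offsets are added to the anchor's x, y offsets to the anchor's y.
    return (px + dx1, py + dy1, px + dx2, py + dy2)


# Made-up decoded output and anchor point, only to show the arithmetic.
decoded = "<p-20>,<p-10>,<p30>,<p45>"
box = offsets_to_box((100, 200), parse_offsets(decoded))
print(box)  # (80, 190, 130, 245)
```

If the model emits offsets in a normalized or discretized range rather than pixels, a scaling step would be needed before adding them to the anchor; that detail is not stated in the issue.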


Maycbj, Jun 07 '23 12:06

Please share the code and model!

spacewalkingninja, Jun 07 '23 14:06