recognize-anything: The size of tag_des is 51 in code, but not clarified in paper.

Open • CZX-Yui opened this issue 1 year ago • 1 comment

Brilliant work~ I have a question about a detail in your code. I notice that the "LLM Tag Des" consists of 50 sentences generated by ChatGPT, which is mentioned in the paper, and the "Hand-Written" prompt is "A photo of xxx". They are compared separately. But in your code, it seems that these two prompts are concatenated, so each tag's embedding is (51, 512). Does this lead to better performance?

Nov 16 '23 09:11 CZX-Yui

Hi, thanks for raising this. We added 'a photo of a {tag}' mainly to handle the case where no LLM-generated descriptions are available for a tag during inference.

Nov 17 '23 01:11 xinyu1205
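
For readers following the thread, here is a minimal sketch of the fallback behaviour described in the reply. It is not the repository's code: `build_prompts`, `toy_encode`, and the placeholder descriptions are hypothetical names invented for illustration, the encoder is a stand-in rather than a real text model, and putting the hand-written prompt first is an assumption. It only shows why a tag with 50 descriptions ends up with 51 prompts, while a tag with no descriptions still gets one.

```python
from typing import Dict, List

EMBED_DIM = 512  # embedding width quoted in the issue


def build_prompts(tag: str, llm_descriptions: Dict[str, List[str]]) -> List[str]:
    """Hand-written prompt plus any LLM descriptions available for `tag`.

    Hypothetical helper; the ordering (hand-written first) is an assumption.
    """
    handwritten = f"a photo of a {tag}"
    return [handwritten] + llm_descriptions.get(tag, [])


def toy_encode(prompts: List[str]) -> List[List[float]]:
    """Stand-in for a text encoder: one EMBED_DIM-long vector per prompt."""
    return [[float(len(p))] * EMBED_DIM for p in prompts]


# Placeholder data: 50 dummy descriptions for one known tag.
descriptions = {"dog": [f"description {i} of a dog" for i in range(50)]}

known = toy_encode(build_prompts("dog", descriptions))
unseen = toy_encode(build_prompts("axolotl", descriptions))

print(len(known), len(known[0]))    # 51 512
print(len(unseen), len(unseen[0]))  # 1 512
```

With this layout, a tag that has no LLM descriptions still yields a non-empty set of prompts, which matches the motivation given in the reply. Whether the extra prompt changes accuracy is not answered in the thread.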
