"<image>" is absent in "llava_instruct_150k_zh.jsonl"

Open: wusize opened this issue 8 months ago • 1 comment

Hi,

I noticed that llava_instruct_150k_zh.jsonl is used in the config that fine-tunes phi3-based llava using datasets from internvl. However, the special token <image> is missing from this jsonl file. In the current llava pipeline, image embeddings are not inserted into the LLM's input sequence when this special token is absent.
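For anyone who wants to check their copy of the file, here is a minimal sketch of how the missing token could be detected and patched. It assumes the usual LLaVA-style record layout (an "image" field plus a "conversations" list of {"from", "value"} turns), which I have not verified against this particular jsonl; the helper names `count_missing` and `ensure_image_token` are hypothetical and not part of xtuner.

```python
import json

IMAGE_TOKEN = "<image>"


def has_image_token(record):
    """True if any conversation turn in the record contains IMAGE_TOKEN."""
    return any(
        IMAGE_TOKEN in turn.get("value", "")
        for turn in record.get("conversations", [])
    )


def count_missing(lines):
    """Count records that reference an image but never mention IMAGE_TOKEN."""
    missing = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if "image" in record and not has_image_token(record):
            missing += 1
    return missing


def ensure_image_token(record):
    """Return a copy of the record with IMAGE_TOKEN prepended to the first
    human turn, if the record has an image and the token is absent.

    Assumption: turns are dicts with "from" and "value" keys (LLaVA-style).
    """
    conversations = [dict(turn) for turn in record.get("conversations", [])]
    patched = {**record, "conversations": conversations}
    if "image" in record and not has_image_token(record):
        for turn in conversations:
            if turn.get("from") == "human":
                turn["value"] = IMAGE_TOKEN + "\n" + turn.get("value", "")
                break
    return patched
```

Usage would be along the lines of `count_missing(open(path, encoding="utf-8"))` to see how many samples are affected, then writing `json.dumps(ensure_image_token(record), ensure_ascii=False)` for each record to a new file. Prepending the token to the first human turn is just one possible placement; the pipeline may expect something different.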

wusize, Jun 15 '24 12:06