
Turn Off Saving Images in LMUData

Open • moured opened this issue 8 months ago • 1 comment

When running experiments, the encoded images from the TSV are automatically saved to the LMUData folder. Is there a way to skip saving them to reduce disk usage? It would be nice to have an argument for this.

Thanks!

moured, Apr 28 '25 09:04

Hi, currently every image is converted into an image file before being fed into the LLM. This happens in the build_prompt function of each dataset, for example:

if self.meta_only:
    tgt_path = toliststr(line['image_path'])
else:
    tgt_path = self.dump_image(line)
...
msgs = [dict(type='image', value=tgt_path)]

To pass the image to the LLM directly in base64 format, you can comment out the lines above and use the following code instead:

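base64_image = line["image"]  # base64-encoded image string from the TSV
msgs = [
    # Embed the image inline instead of pointing to a file on disk
    {"image": f"data:image;base64,{base64_image}"},
]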
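
If it helps, the same idea can be wrapped in a small helper that checks the string really is base64 before building the message. This is only a sketch: `build_base64_message` is a hypothetical name, not a VLMEvalKit function, and it assumes `line["image"]` holds a single base64 string. Whether a given model wrapper accepts this message shape is also an assumption and needs to be checked per model.

```python
import base64


def build_base64_message(line, prefix="data:image;base64,"):
    """Hypothetical helper (not part of VLMEvalKit).

    Builds an inline-image message from a row whose "image" field is
    assumed to hold a single base64-encoded string.
    """
    base64_image = line["image"]
    # Fail early on corrupt data; raises binascii.Error (a ValueError subclass)
    # if the string contains characters outside the base64 alphabet.
    base64.b64decode(base64_image, validate=True)
    return {"image": f"{prefix}{base64_image}"}


# Usage with a stand-in row; real rows come from the dataset TSV.
row = {"image": base64.b64encode(b"abc").decode("ascii")}
msgs = [build_base64_message(row)]
```

Validating up front costs one decode per image, but a broken row then shows up when the prompt is built rather than inside the model call.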

Whether saving images should be made configurable requires further discussion. As far as I know, the base64 image format has drawbacks in terms of size, efficiency, and debuggability. Maybe @kennymckormick can offer more insight.
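
On the size point, a quick illustration: base64 encodes every 3 raw bytes as 4 ASCII characters, so an encoded image is about one third larger than the decoded file. The sketch below is generic Python, not VLMEvalKit code, and `base64_encoded_size` is a name made up for this example.

```python
import base64
import math


def base64_encoded_size(n_raw_bytes):
    """Length in characters of the padded base64 encoding of n_raw_bytes bytes."""
    return 4 * math.ceil(n_raw_bytes / 3)


# Compare the formula with an actual encoding of a dummy 3000-byte payload.
raw = bytes(3000)
encoded = base64.b64encode(raw)
assert len(encoded) == base64_encoded_size(len(raw))  # 4000 characters

overhead = len(encoded) / len(raw)  # about 1.33
```

This covers only the encoding overhead; it does not say anything about how the two options compare in total disk usage for a particular benchmark.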

MaoSong2022, Apr 29 '25 10:04