Discrn dataset of x-instructblip

Open ADu2021 opened this issue 1 year ago • 2 comments

Thank you very much for your work! Do you have any plan to release the original Discrn dataset that is used to evaluate the x-instructblip, rather than the code?

ADu2021 avatar Jan 06 '24 02:01 ADu2021

Hey, man. How did you install x-instructblip? Did you simply follow the README? I had issues with Ninja and CUDA. Did you run into the same problem? Could you help?

giuliannocappellari avatar Feb 08 '24 16:02 giuliannocappellari

Thank you for your interest in the model. You can download the DisCRn dataset as follows:

    from lavis.datasets.builders import load_dataset
    ds = load_dataset('image_pc_discrn')     # image-point cloud data
    ds = load_dataset('audio_video_discrn')  # audio-video data

Note that for the 3D pairs, we were required to remove 628 of the 28173 datapoints from the release because they are associated with BY-SA-licensed point clouds. However, this should not skew the results significantly.

To evaluate X-InstructBLIP on DisCRn:

    python -m torch.distributed.run --nproc_per_node=8 train.py --cfg-path lavis/projects/xinstruct_blip/eval/discrn/audio_video_describe.yaml
    python -m torch.distributed.run --nproc_per_node=8 train.py --cfg-path lavis/projects/xinstruct_blip/eval/discrn/image_3d_describe.yaml
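For a sense of scale regarding the 628 removed 3D datapoints mentioned above, the share can be worked out directly. This is only an arithmetic sketch; it assumes 28173 is the size of the 3D split before removal, which is how the sentence reads.

```python
# Figures quoted in the comment above.
# Assumption: 28173 is the number of 3D pairs before the removal.
total_pairs = 28173
removed_pairs = 628

remaining_pairs = total_pairs - removed_pairs
removed_fraction = removed_pairs / total_pairs

print(f"remaining 3D pairs: {remaining_pairs}")   # 27545
print(f"removed share: {removed_fraction:.2%}")   # 2.23%
```

Under that assumption, roughly 2% of the 3D pairs are missing from the public release, so scores on the released split may differ slightly from those computed on the full set.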

Make sure to update the paths for the AudioCaps audio and the corresponding YouTube videos (which you can download with a tool like youtube-dl) so that they point to your local copies. Do the same for the Objaverse point clouds (here) and the Cap3D rendered images (here).
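Before launching the evaluation, it can help to confirm that those local directories actually exist. The following is a minimal sketch using only the Python standard library; the directory names and the `find_missing_paths` helper are hypothetical placeholders, not part of LAVIS, and should be replaced with wherever you stored the data.

```python
from pathlib import Path


def find_missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk."""
    return [str(p) for p in paths if not Path(p).expanduser().exists()]


# Hypothetical local locations; replace these with your own directories.
local_paths = {
    "AudioCaps audio": "~/data/audiocaps/audio",
    "AudioCaps videos": "~/data/audiocaps/video",
    "Objaverse point clouds": "~/data/objaverse/pointclouds",
    "Cap3D rendered images": "~/data/cap3d/images",
}

missing = find_missing_paths(local_paths.values())
for path in missing:
    print(f"missing: {path}")
if not missing:
    print("all data paths found")
```

This only checks that the directories are present; the paths still have to be written into the evaluation configs by hand.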

Regarding the installation, I responded in a different thread, so we can try to debug the issue there.

artemisp avatar Feb 23 '24 22:02 artemisp