LAVIS
DisCRn dataset of X-InstructBLIP
Thank you very much for your work! Do you plan to release the original DisCRn dataset used to evaluate X-InstructBLIP, rather than only the code?
Hey, man. How did you install X-InstructBLIP? Did you simply follow the README? I had issues with Ninja and CUDA. Did you have the same problem? Can you help?
Thank you for your interest in the model. You can download the DisCRn dataset as follows:
from lavis.datasets.builders import load_dataset

ds = load_dataset('image_pc_discrn')  # image-point cloud data
ds = load_dataset('audio_video_discrn')  # audio-video data
Note that for the 3D pairs, we had to remove 628 of the 28173 datapoints from the release because they are associated with BY-SA-licensed point clouds. However, this should not skew the results significantly.
To evaluate X-InstructBLIP on DisCRn:
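To put the removed share in perspective, here is a quick back-of-the-envelope check that uses only the two counts quoted above. The variable names are just for illustration.

```python
# Counts taken from the note above.
total_pairs = 28173   # image-point cloud datapoints before the removal
removed_pairs = 628   # datapoints dropped for licensing reasons

remaining_pairs = total_pairs - removed_pairs
removed_share = removed_pairs / total_pairs

print(f"remaining datapoints: {remaining_pairs}")
print(f"removed share: {removed_share:.2%}")
```

So roughly 2.2% of the 3D pairs are missing from the public release.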
python -m torch.distributed.run --nproc_per_node=8 train.py --cfg-path lavis/projects/xinstruct_blip/eval/discrn/audio_video_describe.yaml
python -m torch.distributed.run --nproc_per_node=8 train.py --cfg-path lavis/projects/xinstruct_blip/eval/discrn/image_3d_describe.yaml
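If you do not have 8 GPUs, one option is to generate the same two commands with a different `--nproc_per_node` value. The sketch below only builds and prints the command lines; `build_eval_command` is a hypothetical helper, not part of LAVIS, and it is an assumption on my side that the evaluation configs run unchanged with fewer processes.

```python
import shlex

# Config paths copied from the two commands above.
CONFIGS = [
    "lavis/projects/xinstruct_blip/eval/discrn/audio_video_describe.yaml",
    "lavis/projects/xinstruct_blip/eval/discrn/image_3d_describe.yaml",
]


def build_eval_command(cfg_path, num_gpus=8):
    """Return the launch command as an argument list (hypothetical helper)."""
    return [
        "python", "-m", "torch.distributed.run",
        f"--nproc_per_node={num_gpus}",
        "train.py", "--cfg-path", cfg_path,
    ]


# Print the commands for a 2-GPU machine so they can be inspected first.
for cfg in CONFIGS:
    print(shlex.join(build_eval_command(cfg, num_gpus=2)))
```

To actually launch a run from Python, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.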
Make sure to update the paths for the AudioCaps audio and the corresponding YouTube videos (which you can download with a tool like youtube-dl) so that they point to your local copies. Do the same for the Objaverse point clouds (here) and the Cap3D rendered images (here).
As for the installation, I responded in a different thread, so we can try to debug the issue there.
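Before launching the evaluation, it can save time to check that the local data directories actually exist. This is a minimal sketch: the dictionary keys and the `/path/to/...` values are placeholders I made up, not the names used in the LAVIS configs, so replace them with your own locations.

```python
from pathlib import Path


def find_missing_paths(named_paths):
    """Return the names whose path does not exist on disk."""
    return [name for name, path in named_paths.items() if not Path(path).exists()]


# Placeholder locations: replace with wherever you stored each dataset.
local_paths = {
    "audiocaps_audio": "/path/to/audiocaps/audio",
    "audiocaps_video": "/path/to/audiocaps/video",
    "objaverse_pointclouds": "/path/to/objaverse/pointclouds",
    "cap3d_images": "/path/to/cap3d/images",
}

missing = find_missing_paths(local_paths)
if missing:
    print("Missing local data for:", ", ".join(missing))
else:
    print("All data paths found.")
```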