LAVIS
sbu caption dataset format
sbu.json is organized in the format: [{'image': '4385058960_b0f291553e.jpg', 'caption': 'a wooden chair in the living room', 'url': 'http://static.flickr.com/2723/4385058960_b0f291553e.jpg'}, ...]
However, the downloaded sbu_images.rar extracts to directories 0000/ 0001/ 0002/ 0003/ ... 0999/, each containing 1000 images named sequentially: 000.jpg 001.jpg 002.jpg ... 999.jpg
Therefore, the image paths on disk do not correspond to the 'image' fields in the json. @dxli94
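To illustrate the mismatch: assuming the extracted archive simply stores the images in sequential order (1000 per directory, as described above), the on-disk path of the i-th image would be as sketched below, which has no relation to the flickr-style filenames in sbu.json. The function name is hypothetical, just for illustration.

```python
def extracted_path(i: int) -> str:
    """Path of the i-th image in the extracted sbu_images.rar layout,
    assuming directories 0000/..0999/ each hold 1000 sequentially
    named images 000.jpg..999.jpg."""
    return f"{i // 1000:04d}/{i % 1000:03d}.jpg"

print(extracted_path(0))     # 0000/000.jpg
print(extracted_path(1234))  # 0001/234.jpg
```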
Hi, @1024er,
Thanks for raising this. This definitely needs fixing. I'll work on this this week.
Thanks.
Has it been fixed? Thank you~
Hi @1024er ,
It seems the annotations of SBU captions do not properly reflect the image directory structure in the archive.
I have now updated the downloading script to fetch images directly from their urls. Though I wouldn't be surprised if some urls have gone dead, as they tend to over time.
Let me know how it works.
Thanks.
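For anyone fetching the images themselves, a minimal sketch of url-based downloading might look like the following. The function name and paths are hypothetical (not the actual LAVIS downloading script); it saves each image under its 'image' filename and collects the urls that fail, since dead links are expected.

```python
import json
import os
import urllib.request


def download_sbu_images(ann_path, out_dir, timeout=10):
    """Download each annotation's image from its 'url' field into out_dir,
    skipping files already on disk. Returns the list of urls that failed."""
    os.makedirs(out_dir, exist_ok=True)
    with open(ann_path) as f:
        anns = json.load(f)
    failed = []
    for ann in anns:
        dest = os.path.join(out_dir, ann["image"])
        if os.path.exists(dest):
            continue  # already downloaded
        try:
            with urllib.request.urlopen(ann["url"], timeout=timeout) as resp:
                data = resp.read()
            with open(dest, "wb") as out:
                out.write(data)
        except Exception:
            failed.append(ann["url"])  # dead links are common for SBU
    return failed
```

After a run, the failed list can be used to prune the annotation file, as in the filtering script further down this thread.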
Hi, I tried the new annotation file, but I still found a lot of images were missing. I am wondering if there is a script to generate an annotations file based on available images.
I've encountered the same issue. Would you mind providing the processed images via google drive? @dxli94
Here is a processing script for filtering out the invalid records of SBU captions.
import json
import os

import tqdm

nonvalid_records = []
valid_records = []

with open('sbu_captions/annotations/sbu.json', "r") as f:
    dset = json.load(f)

def check_file_exists(filename, path):
    return os.path.exists(os.path.join(path, filename))

# Keep only annotations whose image file actually exists on disk.
for ann in tqdm.tqdm(dset):
    if check_file_exists(ann['image'], 'sbu_captions/images'):
        valid_records.append(ann)
    else:
        nonvalid_records.append(ann)

print('not valid records', len(nonvalid_records), 'valid records', len(valid_records))

print("saving valid")
with open('sbu_captions/annotations/sbu_valid.json', "w") as f:
    json.dump(valid_records, f)

print("saving nonvalid")
with open('sbu_captions/annotations/sbu_nonvalid.json', "w") as f:
    json.dump(nonvalid_records, f)