In 'Finetuning on RVLCDIP', which dataset is used?
Finetuning on RVLCDIP
Download RVLCDIP first and change the path. For OCR, you might need to customize your code.
bash scripts/finetune_rvlcdip.sh # Finetuning on RVLCDIP
Q1. Which dataset?
ocr_dir = os.path.join(data_args.data_dir, data_args.mpdfs_dir, 'cdip-images-full-clean-ocr021121')
image_dir = os.path.join(data_args.data_dir, data_args.mpdfs_dir, 'cdip-images')
label_dir = os.path.join(data_args.data_dir, data_args.rvlcdip_dir, 'labels')
These paths come from run_rvlcdip.py. The directory 'cdip-images-full-clean-ocr021121' is not found in the dataset linked below:
https://paperswithcode.com/dataset/rvl-cdip
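To make it easier to see which of the expected directories are missing from a local download, here is a minimal sketch that rebuilds the three paths from Q1 and reports which ones do not exist. The helper names `expected_dirs` and `missing_dirs` and the example values for `data_dir`, `mpdfs_dir`, and `rvlcdip_dir` are my own placeholders, not part of the repository; only the three `os.path.join` expressions are taken from run_rvlcdip.py.

```python
import os
from types import SimpleNamespace


def expected_dirs(data_args):
    """Rebuild the three directories that run_rvlcdip.py joins together."""
    return {
        "ocr_dir": os.path.join(
            data_args.data_dir, data_args.mpdfs_dir,
            "cdip-images-full-clean-ocr021121"),
        "image_dir": os.path.join(
            data_args.data_dir, data_args.mpdfs_dir, "cdip-images"),
        "label_dir": os.path.join(
            data_args.data_dir, data_args.rvlcdip_dir, "labels"),
    }


def missing_dirs(data_args):
    """Return the names of the expected directories that are not on disk."""
    return [name for name, path in expected_dirs(data_args).items()
            if not os.path.isdir(path)]


if __name__ == "__main__":
    # Placeholder values: replace with the arguments you pass to the script.
    args = SimpleNamespace(data_dir="data", mpdfs_dir="mpdfs",
                           rvlcdip_dir="rvlcdip")
    for name, path in expected_dirs(args).items():
        status = "ok" if os.path.isdir(path) else "MISSING"
        print(f"{name}: {path} [{status}]")
```

With the raw rvl_cdip download, I would expect at least `ocr_dir` to be reported as missing, which is the problem described in Q1.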
Q2. Which OCR? I have downloaded the raw rvl_cdip dataset. To build cdip-images-full-clean-ocr021121 and reach performance matching the numbers listed in the paper, which OCR should I use? Is it https://learn.microsoft.com/en-us/rest/api/computervision/3.1/get-read-result/get-read-result?tabs=HTTP ?
Q3. Is it OK to use rvl_cdip for both pretraining and finetuning?
Thank you!
I ran into this issue, too. Some datasets are not provided.