Image resolution & folder structure for unsupervised pre-training
I'm exploring I-JEPA and want to make sure I understand what it expects in terms of the structure of image_folder. For example, here's my config:
data:
  batch_size: 128
  color_jitter_strength: 0.0
  crop_scale:
  - 0.3
  - 1.0
  crop_size: 224
  image_folder: /data_home/datasets/custom_dataset/unlabeled
  num_workers: 10
  pin_mem: true
  root_path: /data_home/datasets/custom_dataset/unlabeled
  use_color_distortion: false
  use_gaussian_blur: false
  use_horizontal_flip: false
Imagine my image_folder is structured like this, where each batch is a folder containing several thousand unlabeled images:
/data_home/datasets/custom_dataset/unlabeled/batch_001
/data_home/datasets/custom_dataset/unlabeled/batch_002
Is structuring my dataset like this an incorrect way to set up pre-training? For example, will I-JEPA be incorrectly influenced by the "grouping" of images in each batch folder (even though each folder contains randomly assembled unlabeled images)?
Additionally, for pre-training I-JEPA on a new dataset composed of unlabeled data, what resolution should those unlabeled images be?
Thank you!
I think the folder grouping doesn't affect pre-training, since the dataset code doesn't use any labels. You will still need to edit the scripts to fit your use case. Every image is resized to 224 x 224 (your crop_size), so as long as your source images are of good quality, the resizing should have little impact on fidelity.
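To make both points a bit more concrete, here is a rough sketch using only the Python standard library. It is not I-JEPA's actual loader, and the helper names (`list_images`, `min_source_side`) are made up for illustration. The first helper shows that once the files are gathered into one flat list, the batch_001 / batch_002 grouping no longer matters. The second estimates how large a source image needs to be so that the smallest crop is not upsampled, assuming crop_scale is the fraction of the image area that a crop covers (as in a standard random-resized crop) and assuming square images and square crops.

```python
import math
from pathlib import Path

# Hypothetical set of extensions; adjust to whatever your dataset contains.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def list_images(root):
    """Collect every image file under root, ignoring which subfolder it sits in."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def min_source_side(crop_size=224, min_scale=0.3):
    """Smallest square image side (in pixels) for which a crop covering
    min_scale of the image area is still at least crop_size pixels wide.

    Assumes a square image and a square crop: the crop side is then
    sqrt(min_scale) * image_side, which must be >= crop_size.
    """
    return math.ceil(crop_size / math.sqrt(min_scale))


if __name__ == "__main__":
    # Path taken from the config above; replace with your own.
    images = list_images("/data_home/datasets/custom_dataset/unlabeled")
    print(f"found {len(images)} images across all batch folders")
    print("minimum side without upsampling:", min_source_side(224, 0.3))
```

With crop_size: 224 and a minimum crop_scale of 0.3, `min_source_side` returns 409, so under these assumptions images of roughly 409 x 409 or larger would only ever be downsampled. Smaller images still work, but the tightest crops get upsampled.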