
Image resolution & folder structure for unsupervised pre-training

Open rringham opened this issue 1 year ago • 1 comments

I'm exploring I-JEPA and want to make sure I understand what it expects for the structure of image_folder. Here's my config:

data:
  batch_size: 128
  color_jitter_strength: 0.0
  crop_scale:
  - 0.3
  - 1.0
  crop_size: 224
  image_folder: /data_home/datasets/custom_dataset/unlabeled
  num_workers: 10
  pin_mem: true
  root_path: /data_home/datasets/custom_dataset/unlabeled
  use_color_distortion: false
  use_gaussian_blur: false
  use_horizontal_flip: false

Imagine my image_folder is structured like this - where each batch is a folder containing several thousand unlabeled images:

/data_home/datasets/custom_dataset/unlabeled/batch_001
/data_home/datasets/custom_dataset/unlabeled/batch_002

Is structuring my dataset like that an incorrect way of pretraining? E.g., will I-JEPA be incorrectly influenced by the "grouping" of images in each batch folder (even though each folder contains randomly assembled unlabeled images)?

Additionally, for pre-training I-JEPA on a new dataset composed of unlabeled data, what resolution should those unlabeled images be?

Thank you!

rringham avatar Dec 11 '23 22:12 rringham

I'd guess the folder grouping has no impact, since the dataset code doesn't use any labels during training. You will still need to edit the scripts for your use case. As for resolution, every image ends up at 224 x 224 (the crop_size in your config) before it reaches the model. If your source images are of good quality, that resizing costs you less, because enough detail survives it.
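
To make the folder point concrete, here is a minimal sketch in plain Python, not the actual I-JEPA loader. It assumes a folder-based loader that assigns each subfolder an index (as such loaders commonly do), and shows that a label-free loop can simply drop that index and shuffle the paths. The names list_images and shuffled_paths and the extension list are hypothetical, made up for this example.

```python
import random
from pathlib import Path

# Assumed set of image extensions; adjust for your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def list_images(root):
    """Return (path, folder_index) pairs for every image under root's subfolders."""
    root = Path(root)
    folders = sorted(p for p in root.iterdir() if p.is_dir())
    samples = []
    for idx, folder in enumerate(folders):
        for path in sorted(folder.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTS:
                # idx plays the role of a "class" label derived from the folder name.
                samples.append((path, idx))
    return samples


def shuffled_paths(samples, seed=0):
    """Drop the folder index and shuffle, as a label-free training loop would."""
    paths = [path for path, _ in samples]
    random.Random(seed).shuffle(paths)
    return paths
```

With the layout from the question, list_images("/data_home/datasets/custom_dataset/unlabeled") would tag images with 0 for batch_001 and 1 for batch_002, and shuffled_paths would discard those tags. As long as the training objective never reads the index and batches are shuffled, the folder grouping should not leak into pre-training.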

VimukthiRandika1997 avatar Mar 20 '24 15:03 VimukthiRandika1997