
Image resolution & folder structure for unsupervised pre-training

Open • rringham opened this issue 8 months ago • 1 comment

I'm exploring I-JEPA and want to make sure I understand what it expects for the structure of image_folder. Here's my config:

data:
  batch_size: 128
  color_jitter_strength: 0.0
  crop_scale:
  - 0.3
  - 1.0
  crop_size: 224
  image_folder: /data_home/datasets/custom_dataset/unlabeled
  num_workers: 10
  pin_mem: true
  root_path: /data_home/datasets/custom_dataset/unlabeled
  use_color_distortion: false
  use_gaussian_blur: false
  use_horizontal_flip: false

Suppose my image_folder is structured like this, where each batch is a folder containing several thousand unlabeled images:

/data_home/datasets/custom_dataset/unlabeled/batch_001
/data_home/datasets/custom_dataset/unlabeled/batch_002

Is structuring my dataset like this a problem for pre-training? For example, will I-JEPA be biased by the grouping of images into batch folders, even though each folder contains a random assortment of unlabeled images?
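To make the folder question concrete, here is a minimal standard-library sketch of how a folder-per-class loader (the convention used by torchvision's ImageFolder) would see the layout above. Whether I-JEPA's data loader follows this convention, and whether its objective ever reads the label, are assumptions I have not verified in the code. The function name scan_folder_per_class and the img_*.jpg file names are hypothetical and used only for illustration.

```python
import os
import tempfile

# Extensions treated as images in this sketch (an assumption, not I-JEPA's list).
IMG_EXTS = (".jpg", ".jpeg", ".png")


def scan_folder_per_class(root):
    """Mimic a folder-per-class loader: each immediate subfolder becomes a class."""
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_to_idx = {name: i for i, name in enumerate(classes)}
    samples = []
    for name in classes:
        folder = os.path.join(root, name)
        for fname in sorted(os.listdir(folder)):
            if fname.lower().endswith(IMG_EXTS):
                samples.append((os.path.join(folder, fname), class_to_idx[name]))
    return class_to_idx, samples


def build_demo_tree():
    """Create a throwaway copy of the batch_001 / batch_002 layout with empty files."""
    root = tempfile.mkdtemp()
    for batch in ("batch_001", "batch_002"):
        os.makedirs(os.path.join(root, batch))
        for i in range(3):
            open(os.path.join(root, batch, f"img_{i}.jpg"), "wb").close()
    return root


root = build_demo_tree()
class_to_idx, samples = scan_folder_per_class(root)

# A label-free objective would keep only the paths and discard the folder index.
image_paths = [path for path, _label in samples]

print(class_to_idx)      # {'batch_001': 0, 'batch_002': 1}
print(len(image_paths))  # 6
```

Under these assumptions, the batch folders would only produce placeholder labels. If the training loop shuffles samples and never reads those labels, the grouping would not reach the model.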

Additionally, when pre-training I-JEPA on a new unlabeled dataset, what resolution should the images be?
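As a rough way to reason about resolution, the sketch below assumes that crop_scale and crop_size from the config feed a random-resized-crop style transform: a region covering 30% to 100% of the image area is sampled and then resized to 224 x 224. That reading of the config is an assumption, and the helper names sample_crop and resize_factors are hypothetical. The sampling logic is a simplified stand-in, not the library's exact algorithm.

```python
import math
import random


def sample_crop(width, height, scale=(0.3, 1.0), ratio=(3 / 4, 4 / 3), rng=None):
    """Sample a crop size (w, h) covering a random fraction of the image area."""
    rng = rng or random.Random()
    area = width * height
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(10):
        target_area = area * rng.uniform(*scale)
        aspect = math.exp(rng.uniform(*log_ratio))
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            return w, h
    # Simplified fallback: a square crop on the shorter side.
    side = min(width, height)
    return side, side


def resize_factors(crop_w, crop_h, crop_size=224):
    """Per-axis factor needed to bring the crop to crop_size; > 1 means upsampling."""
    return crop_size / crop_w, crop_size / crop_h


rng = random.Random(0)
for source in [(128, 128), (640, 480), (1024, 1024)]:
    w, h = sample_crop(*source, rng=rng)
    fx, fy = resize_factors(w, h)
    print(f"source {source}: crop {w}x{h}, resize factors {fx:.2f} x {fy:.2f}")
```

Under these assumptions, a 128 x 128 source is always upsampled to reach 224, while a 1024 x 1024 source is always downsampled. So the question is really whether source images should be at least somewhat larger than crop_size.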

Thank you!

rringham, Dec 11 '23 22:12