Problems using custom dataset to train Adaface

Open trnikon opened this issue 3 years ago • 6 comments

I am trying to train AdaFace with my own dataset: a custom folder ('data'), a subfolder called 'imgs' that has a collection of other folders ('folder_001', 'folder_002', etc) with various photos of the same face inside each one ('f_001.jpg', 'f_002.jpg' etc). There are in total 30000 folders with images. I am using the following script to train (a small change from 'run_ir50_ms1mv2.sh'):

python main.py \
    --data_root /mnt/disk/data \
    --train_data_path imgs \
    --prefix ir50_ms1mv2_adaface \
    --gpus 1 \
    --use_16bit \
    --arch ir_50 \
    --batch_size 256 \
    --num_workers 8 \
    --epochs 26 \
    --lr_milestones 12,20,24 \
    --lr 0.1 \
    --head adaface \
    --m 0.4 \
    --h 0.333 \
    --custom_num_class 30000 \
    --low_res_augmentation_prob 0.2 \
    --crop_augmentation_prob 0.2 \
    --photometric_augmentation_prob 0.2

The result: FileNotFoundError: [Errno 2] No such file or directory: '/mnt/disk/data/faces_emore/agedb_30/meta/sizes'

For some reason it keeps searching for validation dataset folders such as 'agedb_30', 'faces_emore' that don't exist in my project. Why are these datasets required? Do I need to set val_data_path the same as train_data_path? Am I missing some other parameter that would make this work?

I also tried to overcome this by following the README_TRAIN.md instructions closely and downloading the dataset 'faces_webface_112x112', preprocessing it with convert.py and then replacing the folder 'imgs' with my own 'imgs' folder. The result: FileNotFoundError: Found no valid file for the classes agedb_30, faces_emore. Supported extensions are: .jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp

Finally, it is not clear to me whether I have to convert my RGB training images to BGR before training AdaFace, and whether I can train AdaFace on images at a resolution other than the standard 112x112 (e.g. 224x224).
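The second error above ("Found no valid file for the classes agedb_30, faces_emore") suggests that folders without images were picked up as classes inside the training directory. A minimal sketch for checking the layout described in this post, and for confirming the number passed to `--custom_num_class`, could look like this. The function name `count_identities` and the extension list are illustrative assumptions, not part of AdaFace:

```python
from pathlib import Path

# Assumption: a subset of common image extensions, chosen for illustration.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_identities(train_dir):
    """Return (number of folders containing images, names of folders without any)."""
    with_images = 0
    without_images = []
    for folder in sorted(Path(train_dir).iterdir()):
        if not folder.is_dir():
            continue
        has_image = any(
            p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
            for p in folder.iterdir()
        )
        if has_image:
            with_images += 1
        else:
            without_images.append(folder.name)
    return with_images, without_images
```

Calling `count_identities("/mnt/disk/data/imgs")` on the layout from this post should report 30000 folders with images and an empty second list; any name in the second list is a folder that an image-folder style loader would likely reject.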

trnikon avatar Sep 30 '22 08:09 trnikon

Hi trnikon. agedb_30 is one of the 5 validation sets used to track model performance during training. If you do not need them, you should replace the validation dataset and data_loader with your own. The agedb_30 files are created by this line of code: https://github.com/mk-minchul/AdaFace/blob/76f4ce203a9f768cf6c118c02124ddbae3c3dce9/convert.py#L89
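To see which validation files the training run will fail on before starting it, a small check along these lines may help. It assumes the path pattern from the error message (`<data_root>/<val_data_path>/<set>/meta/sizes`); only agedb_30 is named in this thread, so the other four set names below are an assumption, and `missing_validation_sets` is a hypothetical helper:

```python
from pathlib import Path

# Assumption: only agedb_30 is confirmed by this thread; the other names are guesses.
VAL_SETS = ["agedb_30", "cfp_fp", "lfw", "cplfw", "calfw"]


def missing_validation_sets(data_root, val_data_path, names=VAL_SETS):
    """List validation sets whose meta/sizes file is absent under data_root/val_data_path."""
    base = Path(data_root) / val_data_path
    return [name for name in names if not (base / name / "meta" / "sizes").is_file()]
```

For the command in the original post, `missing_validation_sets("/mnt/disk/data", "faces_emore")` would include `agedb_30`, which matches the reported FileNotFoundError.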

mk-minchul avatar Oct 11 '22 14:10 mk-minchul

Thanks for the update, @mk-minchul.

I got the same error: "No such file or directory: ... /agedb_30/meta/sizes". In my case, I'm trying to use another folder of images (structured like the train folder) to validate the results. For example:

  - "train" folder:
    - "1"..."62" folders:
      - images of each folder class (62)
  - "val" folder:
    - "1"..."62" folders not in train:
      - face images of each folder class (62)

In my case, how can I use a different validation dataset and data_loader based on my "val" folder, without using either agedb_30 or a .rec file?
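One possible starting point for a folder-based validation set is to build face verification pairs (same identity / different identity) from the per-identity image lists. The sketch below is not AdaFace's validation loader; `make_pairs` is a hypothetical helper, and wiring the pairs into training would still require replacing the validation dataset and data_loader as described above:

```python
import itertools
import random


def make_pairs(images_by_id, num_negative, seed=0):
    """Build (image_a, image_b, is_same) pairs from {identity: [image paths]}."""
    rng = random.Random(seed)
    pairs = []
    # Positive pairs: every combination of two images within one identity.
    for paths in images_by_id.values():
        for a, b in itertools.combinations(paths, 2):
            pairs.append((a, b, True))
    # Negative pairs: one random image from each of two distinct identities.
    ids = sorted(i for i, paths in images_by_id.items() if paths)
    if len(ids) >= 2:
        for _ in range(num_negative):
            id_a, id_b = rng.sample(ids, 2)
            pairs.append(
                (rng.choice(images_by_id[id_a]), rng.choice(images_by_id[id_b]), False)
            )
    return pairs
```

For example, `make_pairs({"1": ["val/1/a.jpg", "val/1/b.jpg"], "2": ["val/2/c.jpg"]}, num_negative=3)` yields one positive pair and three negative pairs. Negative pairs are sampled with replacement, so duplicates are possible on small sets.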

I hope that @trnikon or @mk-minchul can tell us how this was solved.

Regards

ANDRESHZ avatar Apr 24 '23 20:04 ANDRESHZ

Have you managed to solve this?

martinenkoEduard avatar Jul 27 '23 18:07 martinenkoEduard

Have you solved this?

martinenkoEduard avatar Jul 27 '23 18:07 martinenkoEduard

Yes, use the .rec file to create the test data folder with the following command:

python convert.py --rec_path <DATASET_ROOT>/<DATASET_NAME> --make_image_files --make_validation_memfiles

and pass the path <DATASET_ROOT> in the training command.

ANDRESHZ avatar Jul 30 '23 21:07 ANDRESHZ

Hello,

I would like to use another dataset for validation (ylfw). Do you have any idea how I should structure the dataset folder to get AdaFace to run on it?

vkouam avatar Dec 31 '23 21:12 vkouam