
About Dataset collate fn

Open YHYeooooong opened this issue 2 years ago • 2 comments

Hello! Thanks for sharing this great work!

I have some questions about the dataset code. I rewrote imagenet.py in data/datasets/classification to create my own dataset.

I changed only the function name and the names passed to 'register_collate_fn' and 'register_dataset'. I then noticed that my collate function's name and the name given to 'register_collate_fn' are not the same, and that the collate name set in dataset.py (lines 45~47) also differs from the 'register_collate_fn' name. Does this mean a different collate_fn (not the one in my code) is called during training and validation? If so, is the function that gets called exactly the same as the collate function in imagenet.py, or does it at least behave the same way? May it make any performance difference by using default collate_fn instead of imagenet_collate_fn?

Here is my dataset.py: CBIS-DDSM_4class_sampled.zip

And here is my YAML file: 0707_mobilevits_real_defualt_lr0.0001_cosine_advanced_multiscale.zip

YHYeooooong, Oct 27 '22 04:10

Hi,

I'm really confused about what the question is here. Please say: 1) what you did (e.g. a code snippet showing what is registered), 2) what you expected, and 3) what went wrong.

For example, I think you mentioned wanting to use your own collate_fn, but I don't see collate_fn_name_train (or the equivalent for val/test) defined in your YAML file. Also note that imagenet.py forcibly sets these values.
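As a rough illustration of why the name in the options matters, here is a minimal sketch of a name-keyed registry. It is a toy stand-in, not the cvnets source: the dictionary `COLLATE_FN_REGISTRY`, the helper `build_collate_fn`, and the flat option key are all assumptions made for the example. The point it shows is that the lookup uses the registered string, never the Python function name.

```python
from typing import Any, Callable, Dict, List

# Toy registry for illustration only; not the actual cvnets code.
COLLATE_FN_REGISTRY: Dict[str, Callable] = {}


def register_collate_fn(name: str) -> Callable:
    """Store the decorated function under `name` when its module is imported."""

    def decorator(fn: Callable) -> Callable:
        if name in COLLATE_FN_REGISTRY:
            raise ValueError(f"collate fn '{name}' is already registered")
        COLLATE_FN_REGISTRY[name] = fn
        return fn

    return decorator


@register_collate_fn(name="my_dataset_collate_fn")
def any_python_name(batch: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    # The Python name above plays no role in the lookup;
    # only the string passed to the decorator does.
    return {key: [sample[key] for sample in batch] for key in batch[0]}


def build_collate_fn(opts: Dict[str, str], split: str = "train") -> Callable:
    """Hypothetical helper: resolve the function named in the options."""
    name = opts[f"collate_fn_name_{split}"]
    if name not in COLLATE_FN_REGISTRY:
        known = sorted(COLLATE_FN_REGISTRY)
        raise KeyError(f"'{name}' is not registered; known names: {known}")
    return COLLATE_FN_REGISTRY[name]
```

In this sketch, a function is found only if the string stored in the options equals the string given to the decorator. A function registered under one name while the options point at another name is simply never selected.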

> May it make any performance difference by using default collate_fn instead of imagenet_collate_fn?

imagenet_collate_fn is designed so that it can remove corrupted images if they exist. AFAIK, efficiency was not the main reason for using it here.
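To make the idea of removing corrupted images concrete, here is a minimal sketch of a collate function that filters bad samples before batching. It is not the cvnets implementation: the function name `filtering_collate_fn` is hypothetical, and it assumes the dataset marks a sample that failed to load by returning `None` as its image. Plain lists are used instead of tensors to keep the example self-contained.

```python
from typing import Any, Dict, List, Optional


def filtering_collate_fn(batch: List[Dict[str, Optional[Any]]]) -> Dict[str, List[Any]]:
    """Drop samples flagged as corrupted, then group the remaining fields."""
    # Assumption: the dataset returns image=None when a file fails to load.
    valid = [sample for sample in batch if sample.get("image") is not None]
    if not valid:
        raise ValueError("every sample in the batch was flagged as corrupted")
    return {
        "image": [sample["image"] for sample in valid],
        "label": [sample["label"] for sample in valid],
    }
```

In this sketch, a batch with no flagged samples passes through unchanged, so the filter only makes a difference when the dataset actually contains corrupted files.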

farzadab, Oct 27 '22 16:10

Thanks for the quick reply!

  1. What you did: I modified the imagenet.py code to create my dataset.py, which is attached above.

[Image: git1]

I changed these parts of imagenet.py (only the names, not the working code):

[Image: git2] [Image: git3] [Image: git4]

In the second picture, the part I actually changed is this one: https://github.com/apple/ml-cvnets/blob/84d992f413e52c0468f86d23196efd9dad885e6f/data/datasets/classification/imagenet.py#L53

The collate function name set in that part is not the same as the name passed to register_collate_fn in the 3rd picture.

  2. What you expected: I think the mismatch between the collate function names in the 2nd and 3rd pictures may prevent the code from finding my collate function from the 3rd picture. In that case, I think the default collate function may be used during training and validation.

  3. What went wrong: If the default collate function or some other collate function is used, does the code work the same as with the imagenet.py collate function? If not, can this change make a performance difference (a model trained with imagenet_collate_fn vs. a model trained with default_collate_fn)?

  • If I rename the collate function as in the 3rd picture, is the renamed function automatically registered when I run the cvnets training code?

YHYeooooong, Oct 28 '22 00:10