
Error when training simdcl with batch_size > 1

Open ScheiklP opened this issue 2 years ago • 4 comments

Hi, I ran the following config,

python train.py \
--dataroot datasets/a_to_b \
--name a_to_b \
--model simdcl \
--no_flip \
--lr_policy linear \
--gpu_ids 0,1,2,3 \
--direction AtoB \
--num_threads 60 \
--batch_size 4

which resulted in

  File "train.py", line 46, in <module>
    model.optimize_parameters()   # calculate loss functions, get gradients, update network weights
  File ".../DCLGAN/models/simdcl_model.py", line 163, in optimize_parameters
    self.loss_G = self.compute_G_loss()
  File ".../DCLGAN/models/simdcl_model.py", line 239, in compute_G_loss
    (self.real_A, self.fake_B, self.real_B, self.fake_A)
  File ".../DCLGAN/models/simdcl_model.py", line 276, in calculate_Sim_loss_all
    feature_realA[i] = feat_k_pool1[i]
RuntimeError: The expanded size of the tensor (256) must match the existing size (1024) at non-singleton dimension 0.  Target sizes: [256, 256].  Tensor sizes: [1024, 256]

The error does not occur with --batch_size 1, and the mismatch factor (1024 = 4 × 256) matches the batch size.
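For reference, the failure can be reproduced in isolation. This is a hedged sketch with illustrative shapes and names (assuming feature_realA is a pre-allocated tensor whose slots expect [patch_num, nc] features, while the pooled features arrive flattened over the batch):

```python
import torch

batch_size, patch_num, nc = 4, 256, 256

# slot i expects a [patch_num, nc] = [256, 256] feature map
feature_realA = torch.empty(2, patch_num, nc)

# pooled features arrive flattened over the batch: [1024, 256]
feat = torch.randn(batch_size * patch_num, nc)

try:
    feature_realA[0] = feat  # assigning [1024, 256] into a [256, 256] slot
except RuntimeError as e:
    print(e)  # "The expanded size of the tensor (256) must match the existing size (1024) ..."
```

With batch_size == 1 the flattened tensor is exactly [256, 256], so the assignment succeeds and the bug stays hidden.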

Cheers, Paul

ScheiklP avatar Mar 30 '22 17:03 ScheiklP

Hello Paul, oh, I didn't check the cases for batch_size > 1, since batch_size > 1 tends to degrade performance (not only for DCLGAN/SimDCL, but for a fairly large share of unsupervised I2I models).

To run the models: use batch_size == 1 for SimDCL, or switch to DCLGAN for batch_size > 1.

To fix this: I think we need to check the part that computes the similarity loss (especially the light networks H, netF3-F6 in the code). Could you try changing x = x.view(1, -1, self.patch_num, self.nc) to x = x.view(num_of_batch_size, -1, self.patch_num, self.nc) in line 514 of /models/networks.py?
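A minimal sketch of what that change does to the shapes (hedged: variable names and dimensions are illustrative, not the exact code in networks.py):

```python
import torch

batch_size, patch_num, nc = 4, 256, 256

# flattened patch features: batch_size * patch_num rows of nc channels
x = torch.randn(batch_size * patch_num, nc)

# current code hard-codes a leading batch dimension of 1, so for
# batch_size > 1 the inferred dimension absorbs the batch instead
x_old = x.view(1, -1, patch_num, nc)           # [1, batch_size, patch_num, nc]

# proposed change: make the batch dimension explicit
x_new = x.view(batch_size, -1, patch_num, nc)  # [batch_size, 1, patch_num, nc]
```

With the batch dimension explicit, downstream per-sample indexing sees [patch_num, nc] features again instead of all batches fused together.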

Cheers, Junlin

JunlinHan avatar Mar 31 '22 03:03 JunlinHan

Hi Junlin, oh, good to know! On CUT I observed fewer mode collapses with batch_size > 1, so I thought that might also benefit DCLGAN and SimDCL. :D The results with batch_size = 1 for SimDCL and DCLGAN batch_size = 4 look very nice so far. ;)

Sure! As soon as I find some time, I'll try! Thanks for the fast response! :)

Cheers, Paul

ScheiklP avatar Mar 31 '22 07:03 ScheiklP

Thanks Paul! Yes, I guess so. A larger batch size should provide more stable gradients, helping to jump out of local minima (where mode collapse usually happens).

Thank you again for sharing this valuable information!

JunlinHan avatar Mar 31 '22 09:03 JunlinHan

And thank you for sharing the code! :)

ScheiklP avatar Mar 31 '22 10:03 ScheiklP