
Error when training simdcl with batch_size > 1

Open ScheiklP opened this issue 2 years ago • 4 comments

Hi, I ran the following config,

python train.py \
--dataroot datasets/a_to_b \
--name a_to_b \
--model simdcl \
--no_flip \
--lr_policy linear \
--gpu_ids 0,1,2,3 \
--direction AtoB \
--num_threads 60 \
--batch_size 4

which resulted in

  File "train.py", line 46, in <module>
    model.optimize_parameters()   # calculate loss functions, get gradients, update network weights
  File ".../DCLGAN/models/simdcl_model.py", line 163, in optimize_parameters
    self.loss_G = self.compute_G_loss()
  File ".../DCLGAN/models/simdcl_model.py", line 239, in compute_G_loss
    (self.real_A, self.fake_B, self.real_B, self.fake_A)
  File ".../DCLGAN/models/simdcl_model.py", line 276, in calculate_Sim_loss_all
    feature_realA[i] = feat_k_pool1[i]
RuntimeError: The expanded size of the tensor (256) must match the existing size (1024) at non-singleton dimension 0.  Target sizes: [256, 256].  Tensor sizes: [1024, 256]

The error does not occur with --batch_size 1, and the mismatch factor (1024 = 4 × 256) matches the batch size.
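For reference, the failure can be reproduced in isolation. This is a hedged sketch with illustrative shapes and names (assuming feature_realA is a pre-allocated tensor whose slots expect [patch_num, nc] features, while the pooled features arrive flattened over the batch):

```python
import torch

batch_size, patch_num, nc = 4, 256, 256

# slot i expects a [patch_num, nc] = [256, 256] feature map
feature_realA = torch.empty(2, patch_num, nc)

# pooled features arrive flattened over the batch: [1024, 256]
feat = torch.randn(batch_size * patch_num, nc)

try:
    feature_realA[0] = feat  # assigning [1024, 256] into a [256, 256] slot
except RuntimeError as e:
    print(e)  # "The expanded size of the tensor (256) must match the existing size (1024) ..."
```

With batch_size == 1 the flattened tensor is exactly [256, 256], so the assignment succeeds and the bug stays hidden.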

Cheers, Paul

ScheiklP avatar Mar 30 '22 17:03 ScheiklP

Hello Paul, oh, I didn't check the cases for batch_size > 1, since batch_size > 1 tends to degrade performance (not only for DCLGAN/SimDCL, but for a fairly large share of unsupervised I2I models).

To run the models: use batch_size == 1 for SimDCL, or switch to DCLGAN for batch_size > 1.

To fix this: I think we need to check the part that computes the similarity loss (especially the light networks H, netF3-F6 in the code). Could you try changing x = x.view(1, -1, self.patch_num, self.nc) to x = x.view(num_of_batch_size, -1, self.patch_num, self.nc) in line 514 of /models/networks.py?
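A minimal sketch of what that change does to the shapes (hedged: variable names and dimensions are illustrative, not the exact code in networks.py):

```python
import torch

batch_size, patch_num, nc = 4, 256, 256

# flattened patch features: batch_size * patch_num rows of nc channels
x = torch.randn(batch_size * patch_num, nc)

# current code hard-codes a leading batch dimension of 1, so for
# batch_size > 1 the inferred dimension absorbs the batch instead
x_old = x.view(1, -1, patch_num, nc)           # [1, batch_size, patch_num, nc]

# proposed change: make the batch dimension explicit
x_new = x.view(batch_size, -1, patch_num, nc)  # [batch_size, 1, patch_num, nc]
```

With the batch dimension explicit, downstream per-sample indexing sees [patch_num, nc] features again instead of all batches fused together.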

Cheers, Junlin

JunlinHan avatar Mar 31 '22 03:03 JunlinHan

Hi Junlin, oh, good to know! On CUT I observed fewer mode collapses with batch_size > 1, so I thought that might also benefit DCLGAN and SimDCL. :D The results with batch_size = 1 for SimDCL and DCLGAN batch_size = 4 look very nice so far. ;)

Sure! As soon as I find some time, I'll try! Thanks for the fast response! :)

Cheers, Paul

ScheiklP avatar Mar 31 '22 07:03 ScheiklP

Thanks Paul! Yes, I guess so. A larger batch size should provide more stable gradients, helping to jump out of local minima (where mode collapse usually happens).

Thank you again for sharing this valuable information!

JunlinHan avatar Mar 31 '22 09:03 JunlinHan

And thank you for sharing the code! :)

ScheiklP avatar Mar 31 '22 10:03 ScheiklP