sgan
Training loss does not change, and validation FDE error is super high
I am trying to train Social-GAN with the code in the repository, but the D loss and G loss get stuck at 1.386 and 0.693, respectively, and never change after that.
Also, the validation FDE is 11.058.
Am I doing the training correctly?
$ PYTHONPATH=. python scripts/train.py --restore_from_checkpoint 0
[INFO: train.py: 118]: Initializing train dataset
[INFO: train.py: 120]: Train dataset size: 2692
[INFO: train.py: 121]: Initializing val dataset
[INFO: train.py: 129]: There are 21 iterations per epoch
[INFO: train.py: 153]: Here is the generator:
[INFO: train.py: 154]: TrajectoryGenerator(
(encoder): Encoder(
(encoder): LSTM(64, 64)
(spatial_embedding): Linear(in_features=2, out_features=64, bias=True)
)
(decoder): Decoder(
(decoder): LSTM(64, 128)
(pool_net): PoolHiddenNet(
(spatial_embedding): Linear(in_features=2, out_features=64, bias=True)
(mlp_pre_pool): Sequential(
(0): Linear(in_features=192, out_features=512, bias=True)
(1): ReLU()
(2): Linear(in_features=512, out_features=1024, bias=True)
(3): ReLU()
)
)
(mlp): Sequential(
(0): Linear(in_features=1152, out_features=1024, bias=True)
(1): ReLU()
(2): Linear(in_features=1024, out_features=128, bias=True)
(3): ReLU()
)
(spatial_embedding): Linear(in_features=2, out_features=64, bias=True)
(hidden2pos): Linear(in_features=128, out_features=2, bias=True)
)
(pool_net): PoolHiddenNet(
(spatial_embedding): Linear(in_features=2, out_features=64, bias=True)
(mlp_pre_pool): Sequential(
(0): Linear(in_features=128, out_features=512, bias=True)
(1): ReLU()
(2): Linear(in_features=512, out_features=1024, bias=True)
(3): ReLU()
)
)
(mlp_decoder_context): Sequential(
(0): Linear(in_features=1088, out_features=1024, bias=True)
(1): ReLU()
(2): Linear(in_features=1024, out_features=128, bias=True)
(3): ReLU()
)
)
[INFO: train.py: 169]: Here is the discriminator:
[INFO: train.py: 170]: TrajectoryDiscriminator(
(encoder): Encoder(
(encoder): LSTM(64, 64)
(spatial_embedding): Linear(in_features=2, out_features=64, bias=True)
)
(real_classifier): Sequential(
(0): Linear(in_features=64, out_features=1024, bias=True)
(1): ReLU()
(2): Linear(in_features=1024, out_features=1, bias=True)
(3): ReLU()
)
)
[INFO: train.py: 233]: Starting epoch 1
[INFO: train.py: 278]: t = 1 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.326
[INFO: train.py: 280]: [D] D_total_loss: 1.326
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 6 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.098
[INFO: train.py: 280]: [D] D_total_loss: 1.098
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 11 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 0.994
[INFO: train.py: 280]: [D] D_total_loss: 0.994
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.687
[INFO: train.py: 283]: [G] G_total_loss: 0.687
[INFO: train.py: 233]: Starting epoch 2
[INFO: train.py: 278]: t = 16 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.097
[INFO: train.py: 280]: [D] D_total_loss: 1.097
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.462
[INFO: train.py: 283]: [G] G_total_loss: 0.462
[INFO: train.py: 278]: t = 21 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.669
[INFO: train.py: 280]: [D] D_total_loss: 1.669
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.534
[INFO: train.py: 283]: [G] G_total_loss: 0.534
[INFO: train.py: 278]: t = 26 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 3
[INFO: train.py: 278]: t = 31 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 36 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 41 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 4
[INFO: train.py: 278]: t = 46 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 51 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 56 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 5
[INFO: train.py: 278]: t = 61 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 66 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 6
[INFO: train.py: 278]: t = 71 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 76 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 81 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 7
[INFO: train.py: 278]: t = 86 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 91 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 96 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 8
[INFO: train.py: 278]: t = 101 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 294]: Checking stats on val ...
[INFO: train.py: 298]: Checking stats on train ...
[INFO: train.py: 305]: [val] ade: 7.662
[INFO: train.py: 305]: [val] ade_l: 16.871
[INFO: train.py: 305]: [val] ade_nl: 14.038
[INFO: train.py: 305]: [val] d_loss: 1.386
[INFO: train.py: 305]: [val] fde: 11.058
[INFO: train.py: 305]: [val] fde_l: 24.348
[INFO: train.py: 305]: [val] fde_nl: 20.260
[INFO: train.py: 305]: [val] g_l2_loss_abs: 21.739
[INFO: train.py: 305]: [val] g_l2_loss_rel: 21.739
[INFO: train.py: 308]: [train] ade: 7.913
[INFO: train.py: 308]: [train] ade_l: 16.727
[INFO: train.py: 308]: [train] ade_nl: 15.018
[INFO: train.py: 308]: [train] d_loss: 1.386
[INFO: train.py: 308]: [train] fde: 11.870
[INFO: train.py: 308]: [train] fde_l: 25.090
[INFO: train.py: 308]: [train] fde_nl: 22.527
[INFO: train.py: 308]: [train] g_l2_loss_abs: 22.713
[INFO: train.py: 308]: [train] g_l2_loss_rel: 22.713
[INFO: train.py: 315]: New low for avg_disp_error
[INFO: train.py: 321]: New low for avg_disp_error_nl
[INFO: train.py: 335]: Saving checkpoint to /home/hcui2/clones/sgan/checkpoint_with_model.pt
[INFO: train.py: 337]: Done.
[INFO: train.py: 343]: Saving checkpoint to /home/hcui2/clones/sgan/checkpoint_no_model.pt
[INFO: train.py: 354]: Done.
[INFO: train.py: 278]: t = 106 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 111 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 9
[INFO: train.py: 278]: t = 116 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 121 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 126 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 10
[INFO: train.py: 278]: t = 131 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 136 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 11
[INFO: train.py: 278]: t = 141 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 146 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 151 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 12
[INFO: train.py: 278]: t = 156 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 161 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 166 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 13
[INFO: train.py: 278]: t = 171 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 176 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 181 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 14
[INFO: train.py: 278]: t = 186 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 191 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 196 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 15
[INFO: train.py: 278]: t = 201 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 294]: Checking stats on val ...
[INFO: train.py: 298]: Checking stats on train ...
[INFO: train.py: 305]: [val] ade: 7.662
[INFO: train.py: 305]: [val] ade_l: 16.870
[INFO: train.py: 305]: [val] ade_nl: 14.037
[INFO: train.py: 305]: [val] d_loss: 1.386
[INFO: train.py: 305]: [val] fde: 11.058
[INFO: train.py: 305]: [val] fde_l: 24.348
[INFO: train.py: 305]: [val] fde_nl: 20.260
[INFO: train.py: 305]: [val] g_l2_loss_abs: 21.739
[INFO: train.py: 305]: [val] g_l2_loss_rel: 21.739
[INFO: train.py: 308]: [train] ade: 7.910
[INFO: train.py: 308]: [train] ade_l: 16.640
[INFO: train.py: 308]: [train] ade_nl: 15.079
[INFO: train.py: 308]: [train] d_loss: 1.386
[INFO: train.py: 308]: [train] fde: 11.827
[INFO: train.py: 308]: [train] fde_l: 24.878
[INFO: train.py: 308]: [train] fde_nl: 22.545
[INFO: train.py: 308]: [train] g_l2_loss_abs: 22.704
[INFO: train.py: 308]: [train] g_l2_loss_rel: 22.704
[INFO: train.py: 315]: New low for avg_disp_error
[INFO: train.py: 321]: New low for avg_disp_error_nl
[INFO: train.py: 335]: Saving checkpoint to /home/hcui2/clones/sgan/checkpoint_with_model.pt
[INFO: train.py: 337]: Done.
[INFO: train.py: 343]: Saving checkpoint to /home/hcui2/clones/sgan/checkpoint_no_model.pt
[INFO: train.py: 354]: Done.
[INFO: train.py: 278]: t = 206 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 16
[INFO: train.py: 278]: t = 211 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 216 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 221 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 17
[INFO: train.py: 278]: t = 226 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 231 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 236 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 18
[INFO: train.py: 278]: t = 241 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 246 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 251 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 19
[INFO: train.py: 278]: t = 256 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 261 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 266 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 20
[INFO: train.py: 278]: t = 271 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 276 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 21
[INFO: train.py: 278]: t = 281 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 286 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 291 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 233]: Starting epoch 22
[INFO: train.py: 278]: t = 296 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 278]: t = 301 / 4200
[INFO: train.py: 280]: [D] D_data_loss: 1.386
[INFO: train.py: 280]: [D] D_total_loss: 1.386
[INFO: train.py: 283]: [G] G_discriminator_loss: 0.693
[INFO: train.py: 283]: [G] G_total_loss: 0.693
[INFO: train.py: 294]: Checking stats on val ...
[INFO: train.py: 298]: Checking stats on train ...
[INFO: train.py: 305]: [val] ade: 7.662
[INFO: train.py: 305]: [val] ade_l: 16.870
[INFO: train.py: 305]: [val] ade_nl: 14.037
[INFO: train.py: 305]: [val] d_loss: 1.386
[INFO: train.py: 305]: [val] fde: 11.058
[INFO: train.py: 305]: [val] fde_l: 24.348
[INFO: train.py: 305]: [val] fde_nl: 20.260
[INFO: train.py: 305]: [val] g_l2_loss_abs: 21.739
[INFO: train.py: 305]: [val] g_l2_loss_rel: 21.739
[INFO: train.py: 308]: [train] ade: 7.797
[INFO: train.py: 308]: [train] ade_l: 16.335
[INFO: train.py: 308]: [train] ade_nl: 14.918
[INFO: train.py: 308]: [train] d_loss: 1.386
[INFO: train.py: 308]: [train] fde: 11.681
[INFO: train.py: 308]: [train] fde_l: 24.471
[INFO: train.py: 308]: [train] fde_nl: 22.348
[INFO: train.py: 308]: [train] g_l2_loss_abs: 22.124
[INFO: train.py: 308]: [train] g_l2_loss_rel: 22.124
[INFO: train.py: 315]: New low for avg_disp_error
[INFO: train.py: 321]: New low for avg_disp_error_nl
[INFO: train.py: 335]: Saving checkpoint to /home/hcui2/clones/sgan/checkpoint_with_model.pt
[INFO: train.py: 337]: Done.
Hi, I have the same problem. Did you end up fixing this issue? Did training until t = 4200 help?
No luck :(
Which activation are you using, ReLU or LeakyReLU?
Did you try a larger learning rate, e.g. 1e-3? Try reusing the hyperparameters from run_traj.sh; maybe that will fix your problem.
I have finally figured out the issue. You need to train with the run_traj.sh script. The default arguments in train.py don't work. The most important argument is --l2_loss_weight 1, which adds the L2 loss to the generator. Social-GAN needs the L2 loss to train and doesn't work with GAN loss only.
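To make the role of that flag concrete, here is a minimal sketch of how a weighted L2 term could enter the generator loss. The function names and the default weight of 0 are assumptions for illustration, not the repository's code; the default is only inferred from the log above, where G_total_loss always equals G_discriminator_loss.

```python
def l2_loss(pred, truth):
    """Sum of squared distances between predicted and true (x, y) points."""
    return sum((px - tx) ** 2 + (py - ty) ** 2
               for (px, py), (tx, ty) in zip(pred, truth))


def generator_total_loss(adv_loss, pred, truth, l2_loss_weight=0.0):
    """Adversarial loss plus a weighted L2 term (hypothetical helper).

    With l2_loss_weight == 0 the generator only sees the adversarial term.
    """
    return adv_loss + l2_loss_weight * l2_loss(pred, truth)


# Toy trajectory of two points; the values are made up for the example.
pred = [(0.0, 0.0), (1.0, 1.0)]
truth = [(0.0, 1.0), (1.0, 3.0)]

gan_only = generator_total_loss(0.693, pred, truth)                   # weight 0
with_l2 = generator_total_loss(0.693, pred, truth, l2_loss_weight=1)  # weight 1
print(gan_only, with_l2)
```

With weight 0 the total stays at the adversarial value, so a discriminator that gives no useful signal leaves the generator with nothing to learn from. With weight 1 the prediction error contributes directly.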
Hi, I have the same problem. I set --l2_loss_weight to 1, but the D loss and G loss still stay unchanged (1.386 and 0.693, respectively), while the L2 loss keeps changing. Do you know how to fix it?
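One observation that may help narrow this down: 0.693 is ln 2 and 1.386 is 2 ln 2, which is what a sigmoid cross-entropy loss gives when the discriminator scores every sample as exactly 0. The printed discriminator ends with a ReLU after its last Linear layer, which would produce exactly that whenever the pre-activation is non-positive. The sketch below only checks the arithmetic. It assumes the GAN losses are cross-entropy on raw scores, which this thread does not show, and the pre-activation values are made up.

```python
import math


def bce_with_logits(score, target):
    """Numerically stable sigmoid cross-entropy for a single raw score."""
    return max(score, 0.0) - score * target + math.log1p(math.exp(-abs(score)))


def relu(x):
    return max(x, 0.0)


# Hypothetical pre-activations; a final ReLU clamps all of them to 0.
pre_activations = [-3.2, -0.5, -1.7]
scores = [relu(x) for x in pre_activations]


def mean_loss(target):
    return sum(bce_with_logits(s, target) for s in scores) / len(scores)


# At score 0 the loss is ln 2 regardless of the target label.
g_loss = mean_loss(1.0)                    # generator: fakes labelled real
d_loss = mean_loss(1.0) + mean_loss(0.0)   # discriminator: real + fake terms

print(round(g_loss, 3), round(d_loss, 3))  # prints 0.693 1.386
```

If this is what is happening, the clamped outputs also pass no gradient back, which would explain why the two GAN losses stay frozen while the L2 loss still moves.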