Negative loss values during training
Hello,
I hope you're doing well. I would appreciate your insights regarding some loss values I observed during training. Specifically, I encountered the following values for v_num=12:
- val_loss = -1.61
- val_d_loss = -1.72
- val_d_loss0 = -1.72
- val_g_loss = 0.109
- val_rec_loss = 0.116
- val_adv_loss = -0.00658

I am curious if these values seem reasonable, especially given the negative values for val_loss and val_d_loss. I would appreciate any guidance on whether this behavior is expected or if it may indicate an issue with my setup or training process.
Thank you in advance for your help!
Best regards, Zack
Thank you for the amazing work your team has done and for sharing your contributions in Facial Expression Recognition (FER). I was wondering if it would be possible for you to provide the training and inference code for LP (Linear Probing) or FT (Fine-Tuning) used in your FER experiments. It would be incredibly helpful for furthering my understanding and experiments.
Hi Zack,
> I am curious if these values seem reasonable, especially given the negative values for val_loss and val_d_loss. I would appreciate any guidance on whether this behavior is expected or if it may indicate an issue with my setup or training process.
The d_loss is the WGAN critic loss, which is fine to be negative:

$L_{adv}^{(d)} = D(\text{fake samples}) - D(\text{real samples})$

where $D(\cdot)$ is the discriminator's score. Whenever the discriminator scores real samples higher than fake ones, this difference is negative, so a negative d_loss is the expected, healthy regime.
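For intuition, here is a minimal sketch of this critic loss in PyTorch (illustrative only, not MARLIN's actual implementation; the tensor values are made up):

```python
import torch

def wgan_d_loss(d_fake_scores: torch.Tensor, d_real_scores: torch.Tensor) -> torch.Tensor:
    """WGAN critic loss: D(fake samples) - D(real samples).

    The critic is trained to score real samples higher than fake ones,
    so as training progresses this difference is pushed below zero.
    """
    return d_fake_scores.mean() - d_real_scores.mean()

# Hypothetical critic scores where real/fake are already well separated:
d_real = torch.tensor([1.5, 1.8, 1.6])   # scores on real samples
d_fake = torch.tensor([-0.2, 0.1, 0.0])  # scores on fake samples
print(wgan_d_loss(d_fake, d_real))       # tensor(-1.6667) -- negative, as expected
```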
**Code for FER**
The code for FER is very similar to the code for attribute classification: both just add a linear layer on top of the MARLIN encoder. You can easily adapt the attribute classification code to FER; a minimal sketch of the idea follows.
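For illustration, a minimal linear-probing (LP) sketch in PyTorch, assuming a pretrained encoder that maps a video clip to a feature vector. The names `encoder`, `feature_dim`, and `num_expressions` are placeholders, not the repository's actual API:

```python
import torch
import torch.nn as nn

class FERLinearProbe(nn.Module):
    """Frozen encoder + one linear layer, as in linear probing (LP)."""

    def __init__(self, encoder: nn.Module, feature_dim: int, num_expressions: int):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():  # freeze the encoder for LP
            p.requires_grad = False
        self.head = nn.Linear(feature_dim, num_expressions)

    def forward(self, video: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():  # encoder stays fixed in the LP setting
            features = self.encoder(video)
        return self.head(features)  # logits over expression classes
```

For fine-tuning (FT), you would instead leave the encoder parameters trainable (skip the freezing loop and the `no_grad` context) and train the whole model end to end.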
**Regarding Negative Loss Values**

Thank you for your explanation. I would like to clarify my observations when running the MARLIN training code. After training for 2000 epochs, the checkpoint saved is either the one with the minimum loss or the one from the final training epoch. I noticed that the minimum loss recorded was -1.777 at epoch 5, while the loss at the final epoch was -0.0013. Could you please confirm if, for MARLIN, having a smaller (more negative) loss is better, or if a loss closer to zero is more desirable? Otherwise, it seems that training for 2000 epochs might not yield much benefit. I would appreciate your clarification. Thank you.
**Attribute Classification Code and CMU-MOSEI Dataset**

I could not find the attribute classification code in your GitHub repository. Additionally, regarding the CMU-MOSEI dataset, did you use mosei_senti_data.pkl or mosei.hdf5? Currently, I can only find four versions: mosei_raw.pkl, mosei_senti_data.pkl, mosei_unalign.hdf5, and mosei.hdf5. Could you please guide me on which dataset to use and how to use it? If possible, I would greatly appreciate it if you could share your code to facilitate my learning. Thank you again for your support.
Thank you sincerely for your timely response and for addressing my questions. I truly appreciate your support and valuable insights.
**Should the save loss be set this way?**
```python
from pytorch_lightning.callbacks import ModelCheckpoint

# Keep only the single checkpoint with the lowest validation generator loss.
checkpoint_callback = ModelCheckpoint(
    dirpath="checkpoints",
    filename="best_generator_model",
    monitor="val_g_loss",
    mode="min",
    save_top_k=1,
    verbose=True,
)
```
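For context, this is how such a callback is typically attached to a Lightning `Trainer`; the `model` and `datamodule` names below are placeholders for your own `LightningModule` and `DataModule`:

```python
import pytorch_lightning as pl

# Lightning checks `val_g_loss` after each validation run and keeps
# only the best-scoring checkpoint via the callback above.
trainer = pl.Trainer(max_epochs=2000, callbacks=[checkpoint_callback])
# trainer.fit(model, datamodule=datamodule)
```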
Could you please let me know?