OpenSDI Question about the training performance

Thanks so much for the release of code and dataset.

I tried to retrain MVSSNet and TruFor on the OpenSDI dataset under the IMDLBenCo framework. After 3 epochs, the test pixel-level accuracy increases a lot, but F1 and IoU drop to lower values.

Is there anything I need to change in train.py? I noticed that you report IoU and F1 in the paper. Should I change the validation metrics?

Jun 18 '25 13:06 Jenna-Bai

That's an excellent observation; this is a classic pixel-level label imbalance problem. Since real pixels vastly outnumber manipulated ones, a model can reach high accuracy early in training by simply predicting every pixel as 'real'. F1 and IoU, on the other hand, measure the model's ability to find the fake pixels, which is our actual goal. What you're seeing is normal for the first few epochs; trust the F1/IoU scores, as they are the true performance indicators.
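
To make the imbalance effect concrete, here is a minimal sketch in plain Python. The helper `pixel_metrics` and the toy 100-pixel mask are made up for illustration and are not the IMDLBenCo evaluators; in particular, returning 0.0 when a denominator is zero is an assumption, and the framework may handle that case differently.

```python
def pixel_metrics(pred, target):
    """Pixel-level accuracy, F1 and IoU for flat binary masks (1 = manipulated).

    Hypothetical helper for illustration only, not the IMDLBenCo implementation.
    """
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)

    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Assumed convention: a zero denominator yields 0.0.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return accuracy, f1, iou


# Toy mask: 100 pixels, only 5 of them manipulated.
target = [1] * 5 + [0] * 95

# A model that predicts every pixel as 'real'.
all_real = [0] * 100
acc_real, f1_real, iou_real = pixel_metrics(all_real, target)

# A model that finds 3 of the 5 fake pixels and raises 2 false alarms.
partial = [1, 1, 1, 0, 0] + [1, 1] + [0] * 93
acc_part, f1_part, iou_part = pixel_metrics(partial, target)

print(f"all real: acc={acc_real:.2f} f1={f1_real:.2f} iou={iou_real:.2f}")
print(f"partial:  acc={acc_part:.2f} f1={f1_part:.2f} iou={iou_part:.2f}")
```

The all-'real' prediction scores 95% accuracy with F1 and IoU of zero, while the prediction that actually localizes some fake pixels is only one point higher in accuracy but clearly better in F1 and IoU.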

Jun 19 '25 06:06 iamwangyabin

Could you please share roughly how many hours you trained these models for?

I trained MVSSNet and SparseViT for a day using IMDLBenCo, but the performance doesn't improve, and there is no difference between intra-dataset and cross-dataset results.

I posted the performance below; could you please give me some advice?

Jun 23 '25 15:06 Jenna-Bai

Very, very slow. As I remember, I trained TruFor for 2 days on a 4×H100 cluster. Your F1 on SD1.5 is a bit low. I suggest you check whether the training loss is still decreasing.
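
One rough way to run that check, assuming you have the per-epoch training losses as a plain list (for example, copied from your logs). The function name `loss_still_decreasing`, the window size, and the 1% threshold are all arbitrary choices for this sketch, not anything provided by IMDLBenCo.

```python
def loss_still_decreasing(losses, window=3, min_rel_drop=0.01):
    """Compare the mean of the last `window` losses with the `window` before it.

    Returns True if the recent mean is lower by more than `min_rel_drop`
    (relative). Hypothetical helper; thresholds are illustrative only.
    """
    if len(losses) < 2 * window:
        raise ValueError("need at least 2 * window loss values")
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    if previous <= 0:
        return False  # relative drop is not meaningful here
    return (previous - recent) / previous > min_rel_drop


# Made-up per-epoch losses, purely for illustration.
improving = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
plateaued = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

print(loss_still_decreasing(improving))
print(loss_still_decreasing(plateaued))
```

If the check still reports a decrease, the model probably just needs more training time; if the loss has plateaued while F1 stays low, the issue is more likely in the configuration or the data.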

Jun 24 '25 05:06 iamwangyabin

Thanks so much for your reply! I tried the training code for MaskCLIP, but all its losses suddenly turn to NaN at epoch 8. Have you encountered such a problem, and do you have any idea how to fix it?

Jul 02 '25 13:07 Jenna-Bai