Very poor initial performance after training on the P3M dataset
I first tried training MODNet on the publicly available 30k-image dataset, and the performance was poor because the dataset was unclean. I have since switched to the P3M-10k dataset, which provides 10k images with good-quality segmentation. I'm now fine-tuning the model using this code, training on top of the existing modnet_photographic_portrait_matting.ckpt checkpoint with backbone_pretrained = True.
However, my losses are drastically high. I've seen much lower losses reported, but mine are: Semantic Loss: 9.234, Detail Loss: 0.334, Matte Loss: 6.392, and they keep hovering around these values. Could you please check the code and tell me whether there's a mistake in my data preparation/augmentation, or whether my training code is wildly incorrect?