DocShadow-SD7K
Question about the config file for the Jung dataset
Dear author, thanks for sharing this great work! I have a question about the config file and the results. I tried to reproduce the results on the Jung dataset reported in Table 5 of the paper, but I cannot achieve a similar SSIM or RMSE using the default config file. The best SSIM and RMSE I got are 84.5 and 21.7, respectively, which are quite far from the values reported in the paper.
Could you please share the config used for training on the Jung dataset? Or is there something else I need to adjust? I really appreciate your help with this.
For your reference, here are the parameters I use in the config file for training on the Jung dataset:
VERBOSE: True
MODEL:
  SESSION: 'Jung'  # Define your current task here
# Optimization arguments.
OPTIM:
  BATCH_SIZE: 1
  NUM_EPOCHS: 300
  LR_INITIAL: 2e-4
  LR_MIN: 1e-6
  SEED: 3407
  WANDB: False
TRAINING:
  VAL_AFTER_EVERY: 1
  RESUME: False
  PS_W: 512
  PS_H: 512
  TRAIN_DIR: '/users/bxiao/DocShadow-SD7K/data/Jung/train/'  # path to training data
  VAL_DIR: '/users/bxiao/DocShadow-SD7K/data/Jung/test/'  # path to validation data
  SAVE_DIR: './checkpoints/'  # path to save models and images
  ORI: True
TESTING:
  WEIGHT: './checkpoints/best.pth'
  SAVE_IMAGES: True
Here is a screenshot of the result:
Hi, the results in Table 5 were tested at 512 x 512 resolution. ORI: True means you are testing at 2K resolution, where every method performs worse. I recommend setting ORI: False, saving the best checkpoint, and then testing with infer.py to see if the result improves. Otherwise, you can download our released checkpoint.
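For anyone else hitting this, a rough sketch of what the flag amounts to; this assumes ORI simply toggles whether images are resized to PS_W x PS_H before metrics are computed (the helper name here is made up, and the repo's dataloader may handle it differently):

import torch.nn.functional as F

def prepare_for_eval(pred, target, ori, ps_w=512, ps_h=512):
    # pred, target: (N, C, H, W) tensors in [0, 1]
    if ori:
        # ORI: True -> evaluate at the original (roughly 2K) resolution
        return pred, target
    # ORI: False -> evaluate at the 512 x 512 resolution used for Table 5
    pred = F.interpolate(pred, size=(ps_h, ps_w), mode='bilinear', align_corners=False)
    target = F.interpolate(target, size=(ps_h, ps_w), mode='bilinear', align_corners=False)
    return pred, target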
It is getting late in my time zone, so I will get back to you tomorrow.
Thank you so much for the prompt response! I have set ORI: False, retrained, and saved the best model. It gives a better result, but I am still not able to reach the PSNR and RMSE reported in the paper. Here are the new results from running python infer.py after retraining:
PSNR: 21.928733825683594, SSIM: 0.8759468197822571, RMSE: 20.85166358947754
In the paper, the values are:
PSNR: 24.36, SSIM: 0.88, RMSE: 16.04
I used the data from the link you shared (67 images for training and 20 images for testing).
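For completeness, this is roughly how I understand the metrics to be computed (a minimal sketch using scikit-image; infer.py may differ in details such as the data range or color space):

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(pred, gt):
    # pred, gt: uint8 RGB arrays of identical size, e.g. 512 x 512 x 3
    psnr = peak_signal_noise_ratio(gt, pred, data_range=255)
    ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=255)
    rmse = np.sqrt(np.mean((gt.astype(np.float64) - pred.astype(np.float64)) ** 2))
    return psnr, ssim, rmse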
Hi, no worries, let me retrain here and get back to you.
Hi, I retrained on my laptop (a 3080 Ti instead of an A6000) and got a similar result to the paper. Performance may differ across devices, but at least it should not be that far off. I am not sure whether it is caused by version differences between packages...
Thanks! Let me try on another server. Did you use the same parameters (except for the batch_size) as below to train on the SD7K dataset? I may try SD7K as well.
OPTIM:
  BATCH_SIZE: 1
  NUM_EPOCHS: 300
  LR_INITIAL: 2e-4
  LR_MIN: 1e-6
  SEED: 3407
  WANDB: False
TRAINING:
  VAL_AFTER_EVERY: 1
  RESUME: False
  PS_W: 512
  PS_H: 512
  TRAIN_DIR: '/users/bxiao/DocShadow-SD7K/data/train/'  # path to training data
  VAL_DIR: '/users/bxiao/DocShadow-SD7K/data/test/'  # path to validation data
  SAVE_DIR: './checkpoints/'  # path to save models and images
  ORI: False
For SD7K the batch size is 4; everything else is the same. You can also refer to our wandb log, we released a comprehensive log of the SD7K training.
Appreciate your response @zinuoli. I just tried training the model on the Jung dataset on Google Colab this time; the results are similar to the ones I got on my local machine, and I am still not able to push RMSE below 20.
Here is the link to the Colab notebook. You can check the training logs at the bottom of the notebook. I did not change anything else except ORI: False in the config.yml file.
Hi @baicenxiao, we will get back to you after 5 May since another team member is on vacation right now. Let me discuss with him to see how we can solve your issue.
Thanks for letting me know @zinuoli. The paper mentions that the smooth L1 loss was used for training, but as far as I can see, in the code (this line) it is MSELoss. Did you switch to MSELoss because of better performance?
MSELoss and SmoothL1Loss have similar performance, but MSELoss converges faster, so you can train for fewer epochs on large datasets such as SD7K and RDD.
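For reference, swapping between the two in PyTorch is a one-line change; a minimal sketch (the actual training loop may combine the loss with other terms):

import torch
import torch.nn as nn

# Both criteria take (prediction, target) tensors of the same shape.
mse = nn.MSELoss()
smooth_l1 = nn.SmoothL1Loss()

pred = torch.rand(1, 3, 512, 512)
target = torch.rand(1, 3, 512, 512)
print(mse(pred, target).item(), smooth_l1(pred, target).item())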
@baicenxiao
I have retrained the model 50 times and shared the corresponding wandb log for your reference, selecting the highest-performing run. Upon analyzing the results, it appears that the model's performance exhibits significant variability. This can be attributed to the limited size of the Jung dataset, which may cause substantial fluctuations in the gradient during training.
To mitigate this issue and achieve more consistent and reliable results, I recommend considering the use of larger datasets such as SD7K or RDD. These datasets provide a more comprehensive and diverse set of training examples, which can help stabilize the training process and improve the model's generalization capabilities.
By leveraging these larger datasets, you can expect more robust and accurate results, reducing the impact of dataset-specific idiosyncrasies on the model's performance. In addition, you can consider trying our data augmentation method, which helps with model training.
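For illustration only, a minimal sketch of the kind of paired augmentation meant here (random crop plus flips applied identically to the shadow and shadow-free images); the helper is hypothetical and the augmentation actually used in the repo may differ:

import random
from torchvision import transforms
import torchvision.transforms.functional as TF

def paired_augment(shadow, gt, ps_h=512, ps_w=512):
    # shadow: shadow-affected input, gt: shadow-free ground truth (PIL images of the same size).
    # Use identical crop/flip parameters for both so the pair stays aligned.
    i, j, h, w = transforms.RandomCrop.get_params(shadow, output_size=(ps_h, ps_w))
    shadow, gt = TF.crop(shadow, i, j, h, w), TF.crop(gt, i, j, h, w)
    if random.random() < 0.5:
        shadow, gt = TF.hflip(shadow), TF.hflip(gt)
    if random.random() < 0.5:
        shadow, gt = TF.vflip(shadow), TF.vflip(gt)
    return shadow, gt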
Please let me know if you have any further questions or if there's anything else I can assist you with.