DocShadow-SD7K
Question about the config file for the Jung dataset
Dear author, thanks for sharing this great work! I have a question about the config file and the results. I tried to reproduce the results on the Jung dataset reported in Table 5 of the paper, but I cannot achieve a similar SSIM or RMSE using the default config file. The best SSIM and RMSE I got are 84.5 and 21.7, respectively, which are quite far from the values reported in the paper.
Could you please share the config used for training on the Jung dataset? Or is there something else I need to adjust? I really appreciate your help with this.
For your reference, here are the parameters I use in the config file for training on the Jung dataset:
VERBOSE: True
MODEL:
  SESSION: 'Jung'  # Define your current task here
# Optimization arguments.
OPTIM:
  BATCH_SIZE: 1
  NUM_EPOCHS: 300
  LR_INITIAL: 2e-4
  LR_MIN: 1e-6
  SEED: 3407
  WANDB: False
TRAINING:
  VAL_AFTER_EVERY: 1
  RESUME: False
  PS_W: 512
  PS_H: 512
  TRAIN_DIR: '/users/bxiao/DocShadow-SD7K/data/Jung/train/'  # path to training data
  VAL_DIR: '/users/bxiao/DocShadow-SD7K/data/Jung/test/'  # path to validation data
  SAVE_DIR: './checkpoints/'  # path to save models and images
  ORI: True
TESTING:
  WEIGHT: './checkpoints/best.pth'
  SAVE_IMAGES: True
Here is a screenshot of the result:
Hi, the results in Table 5 were tested at 512 x 512 resolution. ORI: True means you are testing at 2K resolution, where every method performs worse. I recommend setting ORI: False, saving the best checkpoint, and then testing with infer.py to see if the result improves. Otherwise, you can download our released checkpoint.
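For anyone else hitting this, a rough sketch of what the flag amounts to; this assumes ORI simply toggles whether images are resized to PS_W x PS_H before metrics are computed (the helper name here is made up, and the repo's dataloader may handle it differently):

import torch.nn.functional as F

def prepare_for_eval(pred, target, ori, ps_w=512, ps_h=512):
    # pred, target: (N, C, H, W) tensors in [0, 1]
    if ori:
        # ORI: True -> evaluate at the original (roughly 2K) resolution
        return pred, target
    # ORI: False -> evaluate at the 512 x 512 resolution used for Table 5
    pred = F.interpolate(pred, size=(ps_h, ps_w), mode='bilinear', align_corners=False)
    target = F.interpolate(target, size=(ps_h, ps_w), mode='bilinear', align_corners=False)
    return pred, target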
It is getting late in my time zone, so I will get back to you tomorrow.
Thank you so much for the prompt response! I have set ORI: False, retrained, and saved the best model. It gives a better result, but I am still not able to reach the PSNR and RMSE reported in the paper. Here are the new results from running python infer.py after retraining:
PSNR: 21.928733825683594, SSIM: 0.8759468197822571, RMSE: 20.85166358947754
In the paper, the values are:
PSNR: 24.36, SSIM: 0.88, RMSE: 16.04
I used the data from the link you shared (67 images for training and 20 images for testing).
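For completeness, this is roughly how I understand the metrics to be computed (a minimal sketch using scikit-image; infer.py may differ in details such as the data range or color space):

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(pred, gt):
    # pred, gt: uint8 RGB arrays of identical size, e.g. 512 x 512 x 3
    psnr = peak_signal_noise_ratio(gt, pred, data_range=255)
    ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=255)
    rmse = np.sqrt(np.mean((gt.astype(np.float64) - pred.astype(np.float64)) ** 2))
    return psnr, ssim, rmse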
Hi, no worries, let me retrain here and get back to you.
Hi, I retrained on my laptop (a 3080 Ti instead of an A6000) and got a similar result to the paper. Performance may differ across devices, but at least it should not be that far off. I am not sure whether it is caused by version differences between packages...
Thanks! Let me try on another server. Did you use the same parameters (except for the batch_size) as below to train on the SD7K dataset? I may try SD7K as well.
OPTIM:
  BATCH_SIZE: 1
  NUM_EPOCHS: 300
  LR_INITIAL: 2e-4
  LR_MIN: 1e-6
  SEED: 3407
  WANDB: False
TRAINING:
  VAL_AFTER_EVERY: 1
  RESUME: False
  PS_W: 512
  PS_H: 512
  TRAIN_DIR: '/users/bxiao/DocShadow-SD7K/data/train/'  # path to training data
  VAL_DIR: '/users/bxiao/DocShadow-SD7K/data/test/'  # path to validation data
  SAVE_DIR: './checkpoints/'  # path to save models and images
  ORI: False
For SD7K the batch size is 4; everything else is the same. You can also refer to our wandb log, we released a comprehensive log of the SD7K training.
Appreciate your response @zinuoli. I just tried training the model on the Jung dataset on Google Colab this time; the results are similar to the ones I got on my local machine, and I am still not able to push RMSE below 20.
Here is the link to the Colab notebook. You can check the training logs at the bottom of the notebook. I did not change anything else except ORI: False in the config.yml file.
Hi @baicenxiao, we will get back to you after 5 May since another team member is on vacation right now. Let me discuss with him to see how we can solve your issue.
Thanks for letting me know @zinuoli. The paper mentions that the smooth L1 loss was used for training, but as far as I can see, in the code (this line) it is MSELoss. Did you switch to MSELoss because of better performance?
MSELoss and SmoothL1Loss have similar performance, but MSELoss converges faster, so you can train for fewer epochs on large datasets such as SD7K and RDD.
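For reference, swapping between the two in PyTorch is a one-line change; a minimal sketch (the actual training loop may combine the loss with other terms):

import torch
import torch.nn as nn

# Both criteria take (prediction, target) tensors of the same shape.
mse = nn.MSELoss()
smooth_l1 = nn.SmoothL1Loss()

pred = torch.rand(1, 3, 512, 512)
target = torch.rand(1, 3, 512, 512)
print(mse(pred, target).item(), smooth_l1(pred, target).item())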
@baicenxiao
I have retrained the model 50 times and shared the corresponding wandb log for your reference, selecting the highest-performing run. Upon analyzing the results, it appears that the model's performance exhibits significant variability. This can be attributed to the limited size of the Jung dataset, which may cause substantial fluctuations in the gradient during training.
To mitigate this issue and achieve more consistent and reliable results, I recommend considering the use of larger datasets such as SD7K or RDD. These datasets provide a more comprehensive and diverse set of training examples, which can help stabilize the training process and improve the model's generalization capabilities.
By leveraging these larger datasets, you can expect more robust and accurate results, reducing the impact of dataset-specific idiosyncrasies on the model's performance. In addition, you can consider trying our data augmentation method, which helps with model training.
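For illustration only, a minimal sketch of the kind of paired augmentation meant here (random crop plus flips applied identically to the shadow and shadow-free images); the helper is hypothetical and the augmentation actually used in the repo may differ:

import random
from torchvision import transforms
import torchvision.transforms.functional as TF

def paired_augment(shadow, gt, ps_h=512, ps_w=512):
    # shadow: shadow-affected input, gt: shadow-free ground truth (PIL images of the same size).
    # Use identical crop/flip parameters for both so the pair stays aligned.
    i, j, h, w = transforms.RandomCrop.get_params(shadow, output_size=(ps_h, ps_w))
    shadow, gt = TF.crop(shadow, i, j, h, w), TF.crop(gt, i, j, h, w)
    if random.random() < 0.5:
        shadow, gt = TF.hflip(shadow), TF.hflip(gt)
    if random.random() < 0.5:
        shadow, gt = TF.vflip(shadow), TF.vflip(gt)
    return shadow, gt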
Please let me know if you have any further questions or if there's anything else I can assist you with.