Real-ESRGAN
Fine-tuning at x8 up-sampling
Hello,
The general image up-sampling model RealESRGAN_x4plus does a good job and it makes fine-tuning rather easy. I wonder, does anyone have checkpoints (generator and discriminator) and configurations available for x8 up-sampling, please?
The performance does not need to be optimal, as I would also like to try fine-tuning on my own data, but this time for x8.
Best regards.
I have problems fine-tuning the model, can you share how you prepared the data? And also how you formatted finetune_realesrgan_4x_pairdata.yml; I don't know if I have done something wrong.
You can check this: https://github.com/xinntao/Real-ESRGAN/blob/master/Training.md. You just need to put your raw images in a folder, run the multiscale script to rescale them (optional), run the script to create the meta info txt file (basically it's just lines with the file names), and edit the yml to specify your meta info file, where the data is located and where the pretrained weights are. It's all explained in Training.md!
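For completeness, the meta info file really is that simple. Here is a minimal sketch, assuming your HR images sit in a single hypothetical folder (the repo also ships a script for this step, scripts/generate_meta_info.py, if I remember correctly):

# Minimal sketch: a meta info txt is just one image file name per line,
# relative to dataroot_gt. The paths below are hypothetical examples.
import os

dataroot_gt = 'datasets/mydata'            # what you put in the yml as dataroot_gt
out_txt = 'datasets/mydata/meta_info.txt'  # what you put in the yml as meta_info

with open(out_txt, 'w') as f:
    for name in sorted(os.listdir(dataroot_gt)):
        if name.lower().endswith(('.png', '.jpg', '.jpeg')):
            f.write(name + '\n')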
You can also fine-tune from the x4 model, though you will need to edit the config file:
strict_load_g: true
strict_load_d: true
You need to set them both to false.
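Roughly speaking, and this is only my understanding of what BasicSR does rather than the exact code, strict_load_g: false means the checkpoint is loaded with strict=False, so weights whose names or shapes do not match your network are skipped instead of raising an error. A minimal sketch:

# Hedged sketch of what strict_load_g: false amounts to: load the x4 checkpoint
# into your generator while tolerating missing/unexpected keys.
import torch
from basicsr.archs.rrdbnet_arch import RRDBNet

net_g = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32)
ckpt = torch.load('experiments/pretrained_models/RealESRGAN_x4plus.pth', map_location='cpu')
result = net_g.load_state_dict(ckpt['params_ema'], strict=False)
print(result.missing_keys, result.unexpected_keys)  # anything your net has that the x4 checkpoint lacks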
Thanks @kodxana for the reply, it is great to hear that there is a way to do that from the x4 checkpoints.
However, I am not sure how this works; isn't the x4 generator missing the parameters of some kind of up-sampling block needed to go to x8?
Anyway, I tried to modify the training options as recommended (strict_load_g: false, strict_load_d: false, scale: 8) and I get the following error:
2022-01-26 01:17:50,291 INFO: Loading UNetDiscriminatorSN model from experiments/pretrained_models/RealESRGAN_x4plus_netD.pth, with param key: [params].
2022-01-26 01:17:50,322 INFO: Loss [L1Loss] is created.
2022-01-26 01:17:52,572 INFO: Loss [PerceptualLoss] is created.
2022-01-26 01:17:52,593 INFO: Loss [GANLoss] is created.
2022-01-26 01:17:52,618 INFO: Model [RealESRGANModel] is created.
2022-01-26 01:17:52,931 INFO: Start training from epoch: 0, iter: 0
/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/torch/nn/functional.py:3458: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the d>
"See the documentation of nn.Upsample for details.".format(mode)
/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/torch/nn/functional.py:3503: UserWarning: The default behavior for interpolate/upsample with float scale_factor changed in 1.6.0 to align with other frameworks/libraries, and now uses scale_factor directly, inst>
"The default behavior for interpolate/upsample with float scale_factor changed "
/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/torch/nn/functional.py:3458: UserWarning: Default upsampling behavior when mode=bicubic is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the do>
"See the documentation of nn.Upsample for details.".format(mode)
/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/basicsr/losses/losses.py:16: UserWarning: Using a target size (torch.Size([12, 3, 256, 256])) that is different to the input size (torch.Size([12, 3, 128, 128])). This will likely lead to incorrect results due t>
return F.l1_loss(pred, target, reduction='none')
Traceback (most recent call last):
File "realesrgan/train.py", line 11, in <module>
train_pipeline(root_path)
File "/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/basicsr/train.py", line 169, in train_pipeline
model.optimize_parameters(current_iter)
File "/scratch/abitton/abitton/MM/Real-ESRGAN-original/realesrgan/models/realesrgan_model.py", line 215, in optimize_parameters
l_g_pix = self.cri_pix(self.output, l1_gt)
File "/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/basicsr/losses/losses.py", line 54, in forward
return self.loss_weight * l1_loss(pred, target, weight, reduction=self.reduction)
File "/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/basicsr/losses/loss_util.py", line 91, in wrapper
loss = loss_func(pred, target, **kwargs)
File "/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/basicsr/losses/losses.py", line 16, in l1_loss
return F.l1_loss(pred, target, reduction='none')
File "/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/torch/nn/functional.py", line 2897, in l1_loss
expanded_input, expanded_target = torch.broadcast_tensors(input, target)
File "/localscratch/abitton.25351637.0/env/lib/python3.7/site-packages/torch/functional.py", line 74, in broadcast_tensors
return _VF.broadcast_tensors(tensors) # type: ignore
RuntimeError: The size of tensor a (128) must match the size of tensor b (256) at non-singleton dimension 3
It seems the prediction and target are not of the same shape, so the L1 loss cannot be computed ... What else am I doing wrong, please?
I have the default gt_size: 256, and in the L1 loss pred has spatial size 128 while the target gt has size 256. This means the generator is still doing x4 up-sampling even after I modify the options to strict_load_g: false, strict_load_d: false and scale: 8.
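For reference, the mismatch can be reproduced outside the training loop with the generator alone. A quick sketch, assuming only that basicsr is installed (it is a Real-ESRGAN dependency):

# Quick check: the RRDBNet generator used by RealESRGAN_x4plus up-samples by 4
# regardless of the top-level `scale: 8` option; the network_g section does not
# set RRDBNet's own scale argument, which defaults to 4.
import torch
from basicsr.archs.rrdbnet_arch import RRDBNet

net_g = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32)
lq = torch.rand(1, 3, 32, 32)      # gt_size 256 with scale: 8 gives 32x32 LQ patches
with torch.no_grad():
    print(net_g(lq).shape)         # torch.Size([1, 3, 128, 128]): still x4, hence the 128 vs 256 mismatch

As far as I can tell from the basicsr source, even adding scale: 8 under network_g would not help: the RRDBNet forward pass has two hard-coded x2 nearest-upsample + conv stages, so its output is always 4x the feature resolution.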
Not sure. What I mostly use for fine-tuning is a generated paired dataset; I have not used the on-the-fly generation one.
Okay @kodxana, could you please send me an example .yml option file that you use for fine-tuning from the RealESRGAN_x4plus checkpoint to x8?
Then I can double-check whether there is a setting I forgot to modify that is causing the problem.
Can you send an example of your RealESRGAN_x4plus config for x4?
Please read Training.md, there is almost nothing to change.
Just set the values you want for name, dataroot_gt, meta_info, pretrain_network_g and pretrain_network_d.
# general settings
name: finetune_RealESRGANx4plus_300k_3DS_multiscale_jpg #finetune_RealESRGANx4plus_400k
model_type: RealESRGANModel
scale: 4
num_gpu: auto
manual_seed: 0

# ----------------- options for synthesizing training data in RealESRGANModel ----------------- #
# USM the ground-truth
l1_gt_usm: True
percep_gt_usm: True
gan_gt_usm: False

# the first degradation process
resize_prob: [0.2, 0.7, 0.1]  # up, down, keep
resize_range: [0.15, 1.5]
gaussian_noise_prob: 0.5
noise_range: [1, 30]
poisson_scale_range: [0.05, 3]
gray_noise_prob: 0.4
jpeg_range: [30, 95]

# the second degradation process
second_blur_prob: 0.8
resize_prob2: [0.3, 0.4, 0.3]  # up, down, keep
resize_range2: [0.3, 1.2]
gaussian_noise_prob2: 0.5
noise_range2: [1, 25]
poisson_scale_range2: [0.05, 2.5]
gray_noise_prob2: 0.4
jpeg_range2: [30, 95]

gt_size: 256
queue_size: 180

# dataset and data loader settings
datasets:
  train:
    name: DF2K+OST
    type: RealESRGANDataset
    dataroot_gt: /home/abitton/scratch/abitton/MM
    meta_info: meta_info/meta_info_3_datasets_multiscale_jpg.txt
    io_backend:
      type: disk

    blur_kernel_size: 21
    kernel_list: ['iso', 'aniso', 'generalized_iso', 'generalized_aniso', 'plateau_iso', 'plateau_aniso']
    kernel_prob: [0.45, 0.25, 0.12, 0.03, 0.12, 0.03]
    sinc_prob: 0.1
    blur_sigma: [0.2, 3]
    betag_range: [0.5, 4]
    betap_range: [1, 2]

    blur_kernel_size2: 21
    kernel_list2: ['iso', 'aniso', 'generalized_iso', 'generalized_aniso', 'plateau_iso', 'plateau_aniso']
    kernel_prob2: [0.45, 0.25, 0.12, 0.03, 0.12, 0.03]
    sinc_prob2: 0.1
    blur_sigma2: [0.2, 1.5]
    betag_range2: [0.5, 4]
    betap_range2: [1, 2]

    final_sinc_prob: 0.8

    gt_size: 256
    use_hflip: True
    use_rot: False

    # data loader
    use_shuffle: true
    num_worker_per_gpu: 5
    batch_size_per_gpu: 12
    dataset_enlarge_ratio: 1
    prefetch_mode: ~

  # Uncomment these for validation
  # val:
  #   name: validation
  #   type: PairedImageDataset
  #   dataroot_gt: path_to_gt
  #   dataroot_lq: path_to_lq
  #   io_backend:
  #     type: disk

# network structures
network_g:
  type: RRDBNet
  num_in_ch: 3
  num_out_ch: 3
  num_feat: 64
  num_block: 23
  num_grow_ch: 32

network_d:
  type: UNetDiscriminatorSN
  num_in_ch: 3
  num_feat: 64
  skip_connection: True

# path
path:
  # use the pre-trained Real-ESRNet model
  pretrain_network_g: experiments/pretrained_models/RealESRGAN_x4plus.pth
  param_key_g: params_ema
  strict_load_g: true
  pretrain_network_d: experiments/pretrained_models/RealESRGAN_x4plus_netD.pth
  param_key_d: params
  strict_load_d: true
  resume_state: ~

# training settings
train:
  ema_decay: 0.999
  optim_g:
    type: Adam
    lr: !!float 1e-4
    weight_decay: 0
    betas: [0.9, 0.99]
  optim_d:
    type: Adam
    lr: !!float 1e-4
    weight_decay: 0
    betas: [0.9, 0.99]

  scheduler:
    type: MultiStepLR
    milestones: [300000] #[400000]
    gamma: 0.5

  total_iter: 300000 #400000
  warmup_iter: -1  # no warm up

  # losses
  pixel_opt:
    type: L1Loss
    loss_weight: 1.0
    reduction: mean
  # perceptual loss (content and style losses)
  perceptual_opt:
    type: PerceptualLoss
    layer_weights:
      # before relu
      'conv1_2': 0.1
      'conv2_2': 0.1
      'conv3_4': 1
      'conv4_4': 1
      'conv5_4': 1
    vgg_type: vgg19
    use_input_norm: true
    perceptual_weight: !!float 1.0
    style_weight: 0
    range_norm: false
    criterion: l1
  # gan loss
  gan_opt:
    type: GANLoss
    gan_type: vanilla
    real_label_val: 1.0
    fake_label_val: 0.0
    loss_weight: !!float 1e-1

  net_d_iters: 1
  net_d_init_iters: 0

# Uncomment these for validation
# validation settings
# val:
#   val_freq: !!float 5e3
#   save_img: True

#   metrics:
#     psnr: # metric name
#       type: calculate_psnr
#       crop_border: 4
#       test_y_channel: false

# logging settings
logger:
  print_freq: 100
  save_checkpoint_freq: !!float 5e3
  use_tb_logger: true
  wandb:
    project: ~
    resume_id: ~

# dist training settings
dist_params:
  backend: nccl
  port: 29500
Hello @xinntao; if you have a moment, could you please explain how to fine-tune at x8 from your x4 checkpoint without using a paired dataset, i.e. by generating training images on the fly?
I modified finetune_realesrgan_x4plus.yml with scale: 8, strict_load_g: false, strict_load_d: false and kept the default gt_size: 256.
As shown in the previous messages of this issue, when the loss is computed the target has size 256 while the generated prediction has size 128, meaning the generator still seems to perform x4 up-sampling.
Since the model is a ResNet, I would have thought that the model itself has no up-sampling layers and that the degraded input is already up-sampled to the target size before being passed through the model for sharpening.
Following that reasoning, I suspect the input pre-processing is still configured for x4 before feeding the model, but I don't understand where to set it to x8.
Thanks!
Hello @adrienchaton, I am also having the same problem when training the model with scale set to 8. I have tried both the paired gt/lq method and on-the-fly lq generation with a scale of 8, but both give me this error when computing the L1 loss: "The size of tensor a (128) must match the size of tensor b (256) at non-singleton dimension 3". Have you tried with paired data and got it to work for scale 8?
I didn't manage to fine-tune the x4 model to x8 and I don't think it is actually possible as-is. It would be nice to have feedback from other people, but let me explain why it does not look straightforward to me.
Looking into realesrgan/models/realesrgan_model.py in optimize_parameters, if I set scale to 8 I get the following shapes for gt_size = 256:
lq.shape = (batch, 3, 32, 32), i.e. 256/8
output.shape = (batch, 3, 128, 128), i.e. only x4
gt.shape = (batch, 3, 256, 256), i.e. indeed the x8 target
This means it is not an issue of data preparation, and it does not matter whether the dataset is paired or not; the problem is that the pretrained generator has a fixed up-sampling factor of x4.
Maybe we could play around with basicsr.models.srgan_model to see whether the up-sampling can be modified, or whether, if it is learned, it simply has to be trained at x8 from the start ...
Any thoughts on that would be welcome; for now I just fine-tune at x4.
So you are saying that when you generate an image with the trained model, you get x4 scale?
It seems like, at line 208 of realesrgan/models/realesrgan_model.py (in optimize_parameters), self.output = self.net_g(self.lq) and the generated output has shape 128, which is x4, even after I set scale to 8, whereas the target has shape 256 as expected for x8 ...
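In case someone wants to experiment in the meantime, here is a rough sketch, purely my own idea and not something that ships with Real-ESRGAN, of an x8 RRDBNet variant with a third up-sampling stage. The new conv_up3 weights do not exist in the x4 checkpoint, which is exactly where strict_load_g: false would be needed:

# Hedged sketch only: an x8 RRDBNet variant with a third x2 up-sample stage.
# The class name and approach are my own; nothing like this ships with Real-ESRGAN.
import torch
import torch.nn.functional as F
from torch import nn
from basicsr.archs.rrdbnet_arch import RRDBNet


class RRDBNetX8(RRDBNet):
    def __init__(self, num_in_ch, num_out_ch, num_feat=64, num_block=23, num_grow_ch=32):
        super().__init__(num_in_ch, num_out_ch, scale=4, num_feat=num_feat,
                         num_block=num_block, num_grow_ch=num_grow_ch)
        self.conv_up3 = nn.Conv2d(num_feat, num_feat, 3, 1, 1)  # new layer, randomly initialised

    def forward(self, x):
        feat = self.conv_first(x)
        feat = feat + self.conv_body(self.body(feat))
        # three x2 nearest-neighbour up-samples instead of two -> x8 overall
        feat = self.lrelu(self.conv_up1(F.interpolate(feat, scale_factor=2, mode='nearest')))
        feat = self.lrelu(self.conv_up2(F.interpolate(feat, scale_factor=2, mode='nearest')))
        feat = self.lrelu(self.conv_up3(F.interpolate(feat, scale_factor=2, mode='nearest')))
        return self.conv_last(self.lrelu(self.conv_hr(feat)))


net = RRDBNetX8(3, 3)
with torch.no_grad():
    print(net(torch.rand(1, 3, 32, 32)).shape)  # torch.Size([1, 3, 256, 256])

It would still have to be registered in BasicSR's ARCH_REGISTRY (or instantiated manually) and the new stage trained from scratch, so treat it as a starting point rather than a drop-in fix.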
Please update if you find a way to fix this. In the meantime I'll try fine-tuning the model to a different scale by writing the code myself.
I am also interested in doing x8 scaling factors. Has anybody had success with this since the last comments ^^^?
Hello dear friends, I ran into some problems when I tried to fine-tune the model.
I was using a 3090 GPU. (My English is not good; if something in my description is unclear, please point it out directly. Thanks!)
When fine-tuning the model, does one have to train the model first by following the README?
I skipped the training process and went directly to the fine-tuning part:
python realesrgan/train.py -opt options/finetune_realesrgan_x4plus_pairdata.yml --auto_resume
(1)
and I ran into a problem saying "ModuleNotFoundError: No module named 'realesrgan'".
Then I searched the issues, but running setup.py still did not fix it.
(I followed the guideline here: https://github.com/xinntao/Real-ESRGAN/issues/49)
Besides, at first I did not run the code in a virtual environment; I just ran it on the cloud service https://gpushare.com/.
However, when I ran the code in a virtual environment,
python3 -m venv tutorial-env
source tutorial-env/bin/activate
and then ran the same command (1) as before,
python realesrgan/train.py -opt options/finetune_realesrgan_x4plus_pairdata.yml --auto_resume
I got a new error:
FileNotFoundError: [Errno 2] No such file or directory: 'experiments/pretrained_models/RealESRNet_x4plus.pth'
Right now, I am carefully reading Training.md to find the problem.
To start, I first completed the setup guideline at https://github.com/xinntao/Real-ESRGAN (Dependencies and Installation: Python >= 3.7, PyTorch >= 1.7), including the installation steps:
- Clone repo
- Install dependent packages
Hello, have you made any progress with x8 training?
I encountered an error during training at x3 and x8: RuntimeError: The size of tensor a (340) must match the size of tensor b (256) at non-singleton dimension 3
I have faced this issue also:
FileNotFoundError: [Errno 2] No such file or directory: 'experiments/pretrained_models/RealESRNet_x4plus.pth'
It worked when I changed the path in options/finetune_realesrgan_x4plus.yml
from:
pretrain_network_g: experiments/pretrained_models/RealESRNet_x4plus.pth
to:
pretrain_network_g: experiments/pretrained_models/RealESRGAN_x4plus.pth