Problem on UDM10 dataset

Open wdmwhh opened this issue 2 years ago • 2 comments

Hello, thanks for your great work. I ran into difficulties when using this toolbox: I can reproduce the results on Vid4 but not on the UDM10 dataset.

| Method | Source | UDM10 (BDx4) PSNR/SSIM (Y) |
|---|---|---|
| basicvsr_vimeo90k_bd | reported | 39.9953/0.9695 |
| basicvsr_vimeo90k_bd | test from ckp | 27.1313/0.8327 |
| basicvsr_plusplus_c64n7_4x2_300k_vimeo90k_bd | reported | 40.7216/0.9722 |
| basicvsr_plusplus_c64n7_4x2_300k_vimeo90k_bd | test from ckp | 27.1293/0.8327 |
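
For readers unfamiliar with the "(Y)" in the column header: the metrics are computed on the luma channel only. Below is a minimal sketch of a Y-channel PSNR, assuming the common BT.601 studio-range conversion (the same coefficients as MATLAB's `rgb2ycbcr`). The helper names `rgb_to_y` and `psnr_y` are hypothetical and not part of the toolbox, whose evaluation may differ in details such as border cropping.

```python
import numpy as np

def rgb_to_y(img):
    """Luma (Y) of an RGB uint8 image, BT.601 studio range (16..235).

    Note: cv2.imread returns BGR, so reverse the channel order first
    (e.g. img[..., ::-1]) before calling this helper.
    """
    img = img.astype(np.float64) / 255.0
    return img @ np.array([65.481, 128.553, 24.966]) + 16.0

def psnr_y(a, b):
    """PSNR in dB between the Y channels of two RGB images of equal shape."""
    mse = np.mean((rgb_to_y(a) - rgb_to_y(b)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

# Synthetic stand-ins for a ground-truth frame and a restored frame
black = np.zeros((8, 8, 3), dtype=np.uint8)
white = np.full((8, 8, 3), 255, dtype=np.uint8)
print(psnr_y(black, black))  # identical images: infinite PSNR
print(psnr_y(black, white))  # Y differs by 219 everywhere
```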

I find that the restored image has minor offsets relative to the ground truth.

The UDM10 dataset was constructed according to https://github.com/cszn/KAIR/blob/master/docs/README_VRT.md. Specifically, udm10.zip was downloaded from https://www.terabox.com/web/share/link?surl=LMuQCVntRegfZSxn7s3hXw&path=%2Fproject%2Fpfnl and then processed with prepare_UDM10.py.

Could you release the UDM10 dataset, as you did for Vid4? Thanks again.

wdmwhh avatar Mar 18 '22 12:03 wdmwhh

Hello @wdmwhh, sorry for the late reply. The pixel shift is the main reason why the PSNR is so low. You can use the MATLAB script here to generate the LR data. I will release the data later too.
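
As a rough illustration of how much a small offset can cost in PSNR, and of one way to check for such an offset, here is a minimal sketch. It is not code from the toolbox: `psnr` and `best_shift` are hypothetical helpers, the image is synthetic random noise (a worst case; natural images lose less), and only integer shifts are searched, so a sub-pixel offset would need interpolation instead.

```python
import numpy as np

def psnr(a, b, max_val=255.0):
    """PSNR in dB between two arrays of the same shape."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

def best_shift(ref, img, max_shift=2):
    """Find the integer shift (dy, dx) of `img` that maximises PSNR vs `ref`.

    Requires max_shift >= 1; a border of that width is cropped so that
    pixels wrapped around by np.roll do not affect the score.
    """
    m = max_shift
    best = (-np.inf, (0, 0))
    for dy in range(-m, m + 1):
        for dx in range(-m, m + 1):
            shifted = np.roll(img, (dy, dx), axis=(0, 1))
            score = psnr(ref[m:-m, m:-m], shifted[m:-m, m:-m])
            best = max(best, (score, (dy, dx)))
    return best

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
moved = np.roll(ref, (0, 1), axis=(0, 1))  # simulate a 1-pixel horizontal offset

print(psnr(ref, moved))        # very low, although the content is identical
score, shift = best_shift(ref, moved)
print(score, shift)            # shifting back by (0, -1) restores a perfect match
```

If the best score is found at a non-zero shift, the low-resolution inputs were probably generated on a different sampling grid than the one the model was trained with.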

ckkelvinchan avatar Mar 31 '22 14:03 ckkelvinchan

If I am not mistaken, the code below should have the same effect as @ckkelvinchan's MATLAB script. Copyright: I reused parts of the code from mmedit.

#Original licence: Copyright (c) 2020 xinntao, under the Apache 2.0 license.
import numpy as np
from cv2 import filter2D, imread, imwrite
import glob
import sys
import os

def get_rotated_sigma_matrix(sig_x, sig_y, theta):
    """Calculate the rotated sigma matrix (two dimensional matrix).

    Args:
        sig_x (float): Standard deviation along the horizontal direction.
        sig_y (float): Standard deviation along the vertical direction.
        theta (float): Rotation in radian.

    Returns:
        ndarray: Rotated sigma matrix.
    """

    diag = np.array([[sig_x**2, 0], [0, sig_y**2]]).astype(np.float32)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]]).astype(np.float32)

    return np.matmul(rot, np.matmul(diag, rot.T))


def _mesh_grid(kernel_size):
    """Generate the mesh grid, centering at zero.

    Args:
        kernel_size (int): The size of the kernel.

    Returns:
        xy_grid (ndarray): stacked coordinates with shape
            (kernel_size, kernel_size, 2).
        x_grid (ndarray): x-coordinates with shape (kernel_size, kernel_size).
        y_grid (ndarray): y-coordinates with shape (kernel_size, kernel_size).
    """

    range_ = np.arange(-kernel_size // 2 + 1., kernel_size // 2 + 1.)
    x_grid, y_grid = np.meshgrid(range_, range_)
    xy_grid = np.hstack((x_grid.reshape((kernel_size * kernel_size, 1)),
                         y_grid.reshape(kernel_size * kernel_size,
                                        1))).reshape(kernel_size, kernel_size,
                                                     2)

    return xy_grid, x_grid, y_grid

def calculate_gaussian_pdf(sigma_matrix, grid):
    """Calculate PDF of the bivariate Gaussian distribution.

    Args:
        sigma_matrix (ndarray): The variance matrix with shape (2, 2).
        grid (ndarray): Coordinates generated by :func:`_mesh_grid`,
            with shape (K, K, 2), where K is the kernel size.

    Returns:
        kernel (ndarray): Un-normalized kernel.
    """

    inverse_sigma = np.linalg.inv(sigma_matrix)
    kernel = np.exp(-0.5 * np.sum(np.matmul(grid, inverse_sigma) * grid, 2))

    return kernel

def bivariate_gaussian(kernel_size,
                       sig_x,
                       sig_y=None,
                       theta=None,
                       grid=None,
                       is_isotropic=True):
    """Generate a bivariate isotropic or anisotropic Gaussian kernel.

    In isotropic mode, only `sig_x` is used. `sig_y` and `theta` are
    ignored.

    Args:
        kernel_size (int): The size of the kernel
        sig_x (float): Standard deviation along horizontal direction.
        sig_y (float | None, optional): Standard deviation along the vertical
            direction. If it is None, 'is_isotropic' must be set to True.
            Default: None.
        theta (float | None, optional): Rotation in radian. If it is None,
            'is_isotropic' must be set to True. Default: None.
        grid (ndarray, optional): Coordinates generated by :func:`_mesh_grid`,
            with shape (K, K, 2), where K is the kernel size. Default: None
        is_isotropic (bool, optional): Whether to use an isotropic kernel.
            Default: True.

    Returns:
        kernel (ndarray): normalized kernel (i.e. sum to 1).
    """

    if grid is None:
        grid, _, _ = _mesh_grid(kernel_size)

    if is_isotropic:
        sigma_matrix = np.array([[sig_x**2, 0], [0, sig_x**2]]).astype(np.float32)
    else:
        if sig_y is None:
            raise ValueError('"sig_y" cannot be None if "is_isotropic" is False.')

        sigma_matrix = get_rotated_sigma_matrix(sig_x, sig_y, theta)

    kernel = calculate_gaussian_pdf(sigma_matrix, grid)
    kernel = kernel / np.sum(kernel)

    return kernel
    
    
if __name__ == '__main__':
    # Give as argument the path to a folder containing a video sequence
    seq_dir = os.path.normpath(sys.argv[1])  # drop any trailing slash
    out_dir = os.path.join('output_lr', os.path.basename(seq_dir))
    os.makedirs(out_dir, exist_ok=True)  # also creates 'output_lr' if missing

    scale = 4
    sigma = 1.6
    kernel_size = 12

    # bivariate_gaussian already returns a kernel normalized to sum to 1
    kernel = bivariate_gaussian(kernel_size, sigma, is_isotropic=True)

    for filename in sorted(glob.glob(os.path.join(seq_dir, '*.png'))):
        img = imread(filename)

        # Blur, then keep every `scale`-th pixel starting at index scale//2-1
        lq = filter2D(img, -1, kernel)
        lq = lq[scale//2-1:-scale//2+1:scale, scale//2-1:-scale//2+1:scale, :]
        imwrite(os.path.join(out_dir, os.path.basename(filename)), lq)
    print("finished with", seq_dir)

Surprisingly, it gives almost the same BIx4 images as the ones in the udm10 version I downloaded a few months ago. Only 327 pixel values differ in udm10/BIx4/archpeople/00000003.png, and each difference is +1 or -1 in one color channel; e.g. the pixel at matrix coordinate (0,114) is [57 71 88] in the downloaded version and [56 71 88] in the version produced by my code.
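
A comparison like the one described above could be done along these lines. This is only a sketch: `diff_stats` is a hypothetical helper, and small synthetic arrays stand in for the two PNGs, which would normally be loaded with `cv2.imread`.

```python
import numpy as np

def diff_stats(a, b):
    """Return (number of differing values, largest absolute difference)."""
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    # Cast to a signed type so that uint8 subtraction cannot wrap around
    diff = np.abs(a.astype(np.int16) - b.astype(np.int16))
    return int(np.count_nonzero(diff)), int(diff.max())

# Synthetic stand-ins for the downloaded and the generated image
downloaded = np.full((4, 4, 3), 88, dtype=np.uint8)
downloaded[0, 2] = [57, 71, 88]
generated = downloaded.copy()
generated[0, 2] = [56, 71, 88]

count, max_diff = diff_stats(downloaded, generated)
print(count, max_diff)  # one value differs, by 1
```

Differences of at most 1 in isolated values are what one would expect from rounding differences between two implementations of the same filter.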

gauone avatar Apr 22 '22 10:04 gauone

Please check this issue. @Z-Fran

zengyh1900 avatar Oct 09 '22 12:10 zengyh1900

Thanks, @gauone. Closing due to inactivity; please reopen if there are any further problems, @wdmwhh.

Z-Fran avatar Oct 11 '22 05:10 Z-Fran