SESR
Bad artifacts in output image~~~~
- The image below comes from the "out" directory after running
    CUDA_VISIBLE_DEVICES=1 python test.py --cuda --model=model/model_div2k.pth
- The weight file model_div2k.pth you provide does reproduce the PSNR reported in readme.txt; however, the output images do not look as good as that PSNR suggests.
To fix this, edit lines 147-149:
- remove the np.uint8() cast
- convert the image to BGR and use OpenCV to write the output
@opteroncx I am trying it. Thanks for your tips. Could you share more information about your training command?
Use DIV2K. Randomly rotate or flip the input matrix in the dataloader (augmentation). Train with 32x32 input for 200 epochs, with lr=1e-4 decreased by a factor of 10 after 100 epochs. Then fine-tune with 48x48 input; this gains about 0.3 dB.
For example:
    import numpy as np

    def data_augment(im, num):
        # im is a (C, H, W) array; flips and rotations are applied in (H, W, C) layout
        org_image = im.transpose(1, 2, 0)
        if num == 0:
            # vertical flip
            transform = np.flipud(org_image)
        elif num == 1:
            # horizontal flip
            transform = np.fliplr(org_image)
        elif num == 2:
            # horizontal flip followed by vertical flip
            transform = np.flipud(np.fliplr(org_image))
        elif num == 3:
            # rotate 90 degrees counterclockwise
            transform = np.rot90(org_image)
        elif num == 4:
            # rotate 90 degrees clockwise
            transform = np.rot90(org_image, -1)
        elif num == 5:
            # rotate counterclockwise, then vertical flip
            transform = np.flipud(np.rot90(org_image))
        elif num == 6:
            # rotate clockwise, then vertical flip
            transform = np.flipud(np.rot90(org_image, -1))
        else:
            # any other value leaves the image unchanged
            transform = org_image
        # back to (C, H, W)
        return transform.transpose(2, 0, 1)
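When wiring this into a dataloader, two details are easy to miss: the low-resolution input and its high-resolution target need the same transform index, and the NumPy flips return views with negative strides, which some PyTorch versions refuse to convert to tensors. A minimal sketch of one way to handle both, assuming square (C, H, W) patches; `augment_chw` and `random_augment_pair` are hypothetical names, not functions from the repository:

```python
import numpy as np

def augment_chw(im, num):
    """Compact equivalent of data_augment for a (C, H, W) array."""
    hwc = im.transpose(1, 2, 0)
    ops = {
        0: lambda a: np.flipud(a),
        1: lambda a: np.fliplr(a),
        2: lambda a: np.flipud(np.fliplr(a)),
        3: lambda a: np.rot90(a),
        4: lambda a: np.rot90(a, -1),
        5: lambda a: np.flipud(np.rot90(a)),
        6: lambda a: np.flipud(np.rot90(a, -1)),
    }
    # Any index without an entry (for example 7) returns the image unchanged.
    out = ops.get(num, lambda a: a)(hwc)
    # Copy into contiguous memory so the result has no negative strides.
    return np.ascontiguousarray(out.transpose(2, 0, 1))

def random_augment_pair(lr_patch, hr_patch, rng):
    """Draw one index in 0..7 and apply it to both the input and the target."""
    num = int(rng.integers(0, 8))
    return augment_chw(lr_patch, num), augment_chw(hr_patch, num)
```

A call would look like `lr_aug, hr_aug = random_augment_pair(lr_patch, hr_patch, np.random.default_rng())`, made once per sample inside the dataset's item lookup.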
@opteroncx thanks for your tips on image display. I tried replacing the code in lines 147-149
    img = np.uint8(img)
    im = Image.fromarray(img)
    im.save(out_path+name[:-4]+'.png')
with the following code:
    img_norm = cv2.normalize(img, img, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8U)
    img_bgr = cv2.cvtColor(img_norm, cv2.COLOR_RGB2BGR)
    cv2.imwrite(out_path+name[:-4]+'.png', img_bgr)
With this change, the output image is saved correctly, without erroneous pixels.
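One caveat with min-max normalization: it stretches every image so that its darkest value becomes 0 and its brightest becomes 255, which can shift brightness and contrast relative to the network output. If the artifacts come from values slightly outside 0-255 wrapping around in the np.uint8() cast (a likely cause, though not confirmed in this thread), clipping is a gentler alternative. A minimal sketch, assuming img is a float (H, W, 3) RGB array on a nominal 0-255 scale; `to_uint8_bgr` is a hypothetical helper, not part of test.py:

```python
import numpy as np

def to_uint8_bgr(img_rgb):
    """Round, clip to [0, 255], cast to uint8, and reorder RGB to BGR."""
    img = np.asarray(img_rgb, dtype=np.float64)
    # Clip before the cast so out-of-range values saturate instead of wrapping.
    img = np.clip(np.rint(img), 0, 255).astype(np.uint8)
    # Reverse the channel axis (RGB to BGR) and make the result contiguous.
    return np.ascontiguousarray(img[..., ::-1])
```

The returned array can then be written with cv2.imwrite(out_path+name[:-4]+'.png', to_uint8_bgr(img)).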
@opteroncx
Use DIV2K.
Randomly rotate or flip the input matrix in the dataloader (augmentation).
Train with 32x32 input for 200 epochs, with lr=1e-4 decreased by a factor of 10 after 100 epochs.
Then fine-tune with 48x48 input; this gains about 0.3 dB.
Does that mean I should follow the steps below? If something is wrong, please point it out, thanks!
- feed DIV2K_train_HR into LapSRN's generate_train_lap_pry.m script, setting its stride = 64, to generate 32x32 input image patches
- call the def data_augment(im,num) function after training_data_loader = DataLoader()
- use the command
    python main.py --cuda --batchSize=64 --nEpochs=200 --lr=1e-4 --step=100
  to execute the first round of training
- then repeat step 1 with stride = 96 to generate 48x48 input image patches
I think your suggestions will work as you say, but what are the fine-tuning settings for the next step, after the first round of training?
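On the patch-generation steps above: in a generic sliding-window extractor, the stride only sets the spacing (and so the overlap and count) of patches, while the patch size is a separate parameter. Whether generate_train_lap_pry.m couples the two is not confirmed in this thread, so it is worth checking the script's patch-size setting rather than relying on stride alone. A minimal Python sketch of the idea, not a port of the MATLAB script; `extract_patches` is a hypothetical name:

```python
import numpy as np

def extract_patches(image, patch_size, stride):
    """Cut square patches from an (H, W) or (H, W, C) array on a regular grid."""
    h, w = image.shape[:2]
    patches = []
    for top in range(0, h - patch_size + 1, stride):
        for left in range(0, w - patch_size + 1, stride):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    # Raises ValueError if the image is smaller than one patch.
    return np.stack(patches)
```

With a fixed patch size, a smaller stride yields more, heavily overlapping patches, and a larger stride yields fewer.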
@opteroncx Hi?
Decrease the learning rate to 1e-5 and set step to 10.
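To make the two stages concrete, here is a sketch of the schedule these settings would produce, assuming that --step means "divide the learning rate by 10 every step epochs" and that epochs are counted from 0. Both are assumptions about main.py, not something confirmed in this thread, and `step_decay_lr` is a hypothetical name:

```python
def step_decay_lr(base_lr, epoch, step):
    """Learning rate at `epoch` when it is divided by 10 every `step` epochs."""
    return base_lr * (0.1 ** (epoch // step))

# Stage 1 (32x32 patches): --lr=1e-4 --step=100
stage1 = [step_decay_lr(1e-4, e, 100) for e in (0, 99, 100, 199)]

# Fine-tuning (48x48 patches): lr=1e-5, step=10
stage2 = [step_decay_lr(1e-5, e, 10) for e in (0, 9, 10, 20)]
```

Under these assumptions, stage 1 stays at 1e-4 for the first 100 epochs and drops to 1e-5 afterwards, while fine-tuning starts at 1e-5 and drops by a further factor of 10 every 10 epochs.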