About deblur training

Open Davidcoach opened this issue 3 years ago • 12 comments
Hello, I degraded the images in FFHQ and want to use the deblurring process in MPRNet to restore them. When I trained the model, I first ran into this error:

    Traceback (most recent call last):
      File "/home/ma-user/work/MPRnet/train.py", line 120, in <module>
        loss_char = np.sum([criterion_char(restored[j],target) for j in range(len(restored))])
      File "<__array_function__ internals>", line 6, in sum
      File "/home/ma-user/anaconda3/envs/PyTorch-1.8/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2260, in sum
        initial=initial, where=where)
      File "/home/ma-user/anaconda3/envs/PyTorch-1.8/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
        return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
      File "/home/ma-user/anaconda3/envs/PyTorch-1.8/lib/python3.7/site-packages/torch/tensor.py", line 621, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

So I changed the loss function to this:

    loss_char = torch.tensor([criterion_char(restored[j],target) for j in range(len(restored))]).sum()
    loss_edge = torch.tensor([criterion_edge(restored[j],target) for j in range(len(restored))]).sum()
    loss = ((loss_char) + (0.05*loss_edge)).requires_grad_(True)

This seems to work, but after training I found that the PSNR is always 12.4697 and never changes; the model learned nothing from the data. What can I do?

Davidcoach • Jul 05 '22 03:07

I ran into the same problem. How did you solve it in the end?

KKKLeouee • Jul 20 '22 05:07

Hello, I degraded the images in FFHQ and want to use the deblurring process in MPRNet to restore them. When I trained the model, I first ran into this error at line 120 of /home/ma-user/work/MPRnet/train.py, loss_char = np.sum([criterion_char(restored[j],target) for j in range(len(restored))]): TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. So I changed the loss to loss_char = torch.tensor([criterion_char(restored[j],target) for j in range(len(restored))]).sum(), loss_edge = torch.tensor([criterion_edge(restored[j],target) for j in range(len(restored))]).sum(), and loss = ((loss_char) + (0.05*loss_edge)).requires_grad_(True). This seems to work, but after training the PSNR is always 12.4697. What can I do?

I ran into the same problem. How did you solve it in the end?

KKKLeouee • Jul 20 '22 06:07

Hello, I found that the problem was the warmup_scheduler. You can try removing it or increasing the max epoch.
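
One way to try the first suggestion is to step a plain cosine schedule directly, with no warmup wrapper around it. The snippet below is only a minimal sketch under assumptions: `num_epochs`, `lr_initial`, and `lr_min` are hypothetical values, the one-layer model is a stand-in for the real network, and `CosineAnnealingLR` is used here as a generic replacement rather than as the repository's exact setup. It only shows the learning rate starting at its full value; it does not confirm the scheduler as the root cause.

```python
import torch
import torch.nn as nn

# Hypothetical values; substitute the ones from your own training config.
num_epochs = 10
lr_initial = 2e-4
lr_min = 1e-6

model = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # stand-in for the real network
optimizer = torch.optim.Adam(model.parameters(), lr=lr_initial)

# Plain cosine annealing, with no warmup scheduler wrapped around it.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=num_epochs, eta_min=lr_min
)

lrs = []
for epoch in range(num_epochs):
    # Record the learning rate that this epoch would train with.
    lrs.append(optimizer.param_groups[0]["lr"])
    # ... one epoch of training would go here ...
    optimizer.step()   # called before scheduler.step(), the order PyTorch expects
    scheduler.step()

print("first epoch lr:", lrs[0])
print("last epoch lr:", lrs[-1])
```

Printing `lrs` for your own epoch count is a quick way to check whether the learning rate ever reaches a useful value within the run.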

Davidcoach • Jul 20 '22 07:07

This might help https://github.com/swz30/MPRNet/issues/91#issuecomment-972653203

adityac8 • Jul 20 '22 09:07

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. How can I solve this? I have the same problem as the original poster, and I am also using Python 3.7.

userHLN • Oct 23 '22 12:10

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. How can I solve this? I have the same problem as the original poster, and I am also using Python 3.7.

Have you solved it? I ran into this problem too.

Makohhh • Oct 31 '22 07:10

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. How can I solve this? I have the same problem as the original poster, and I am also using Python 3.7.

loss_char = sum([criterion_char(restored[j],target) for j in range(len(restored))]) Changing np.sum to sum fixes it.
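
To illustrate why this change matters, here is a minimal sketch run on CPU. Everything in it apart from the two loss expressions is an assumption made for the demo: the one-layer `model`, the three scaled outputs standing in for per-stage restorations, and `nn.L1Loss` used in place of the repository's Charbonnier loss. Python's built-in `sum` adds the loss tensors with `+`, so the result stays on the same device and inside the autograd graph. Wrapping the values in `torch.tensor(...)` instead copies them into a new tensor with no link to the model, so `backward()` never reaches the weights, which would be consistent with the PSNR that never changes in the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins, not MPRNet itself: a one-layer "model" whose output
# is scaled three ways to mimic a list of per-stage restorations.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)
criterion_char = nn.L1Loss()  # placeholder for the repo's Charbonnier loss

inp = torch.rand(2, 3, 8, 8)
target = torch.rand(2, 3, 8, 8)
restored = [model(inp) * scale for scale in (1.0, 0.5, 0.25)]

# Variant from the original post: torch.tensor(...) copies the loss values
# into a brand-new tensor that has no link back to the model parameters.
# (.item() is used here to make that copy explicit.)
loss_detached = torch.tensor(
    [criterion_char(restored[j], target).item() for j in range(len(restored))]
).sum().requires_grad_(True)
loss_detached.backward()
grad_after_detached = model.weight.grad  # still None: nothing reached the model

# Suggested fix: the built-in sum keeps every term in the autograd graph.
loss_char = sum([criterion_char(restored[j], target) for j in range(len(restored))])
loss_char.backward()
grad_after_sum = model.weight.grad  # now a tensor with the weight's shape

print("detached variant reached the model:", grad_after_detached is not None)
print("built-in sum reached the model:", grad_after_sum is not None)
```

The two variants produce the same loss value, so the printed loss alone does not reveal the problem; only the gradients differ.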

wpc0086 • Nov 25 '22 02:11

Has anyone seen the loss suddenly become very large during training? My loss exploded when I reached epoch 20.

drifterss • Mar 27 '23 07:03

Has anyone seen the loss suddenly become very large during training? My loss exploded when I reached epoch 20.

Hey, did you manage to solve this?

Feecuin • Apr 16 '24 08:04