Lingchen Sun

The Hong Kong Polytechnic University

Results: 11 comments of Lingchen Sun

why use one-step sampling

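Hello, and thank you for your interest in our work! During inference, x_T is pure noise. The step from x_T to x_tmax is there to extract information directly from the LR image. Restoration differs from generation: the LR image already contains a lot of information, so there is no need to generate every pixel "from scratch" as in a generation task. Once that information has been extracted, the state is analogous to an intermediate step of the diffusion process, so there is no need to denoise step by step from x_T. The benefits are reduced randomness, more stable quality, and faster inference. The idea of "adding the corresponding noise to the LR image to obtain x_tmax" is very interesting; we have not run any validation experiments on it so far. My feeling, though, is that this approach may be affected more strongly by the degree of LR degradation. If you find anything interesting, further discussion is welcome!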

Question About Figure.1

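Hello! Thank you for your interest in our findings. In model/q_sampler there is a function named _predict_xstart_from_eps. Given the noisy x_t and the estimated noise, it returns the corresponding estimate of the noise-free x0. All the tests in Figure 1 are based on the noise-free x0 estimated at each step.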
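
For readers unfamiliar with this step, the relation behind a function like _predict_xstart_from_eps can be sketched as follows. This is a minimal, dependency-free illustration that assumes the standard DDPM parameterization x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps. The function names, the linear beta schedule, and its values are illustrative assumptions, not code taken from the repository.

```python
import math


def make_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alphas_cumprod = []
    running = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alphas_cumprod.append(running)
    return alphas_cumprod


def q_sample(x0, eps, alpha_bar):
    """Forward process: add noise eps to a clean sample x0."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]


def predict_xstart_from_eps(x_t, eps, alpha_bar):
    """Invert the forward relation: estimate x0 from noisy x_t and noise eps."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [(x - b * e) / a for x, e in zip(x_t, eps)]
```

With the true noise, predict_xstart_from_eps(q_sample(x0, eps, a), eps, a) recovers x0 up to floating-point error. During inference eps is the network's estimate, so the result is only an estimate of x0 at that step.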

ModuleNotFoundError: No module named 'utils.devices'

Hello! I have revised the code; please try again and see whether the problem is resolved. Thank you for pointing this out!

Xt_min -> X0

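Hello, and thank you very much for your recognition of our work. The process first estimates the noise at time t_min from Xt_min, then uses that estimated noise together with Xt_min to estimate X0 at time t_min. Finally, the X0 estimated at t_min is output directly as the final diffusion result and is fed into the VAE decoder for further enhancement.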

Xt_min -> X0

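Hello! Our later ablation experiments found that using fewer steps, as well as different combinations of t_max and t_min, leads to different super-resolution results. There is still room to explore here, and discussion is welcome. Table 2 in our paper reports the parameter count of the entire model, whereas Table 4 of StableSR reports only the trainable parameters. Because a considerable portion of the model parameters are frozen during training, the two counts differ substantially.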

library pytorch_lightning.utilities.distributed problem

Hi, the problem seems to be caused by the pytorch_lightning version. My environment: pytorch-lightning 1.4.2, torch 2.0.1+cu118, Python 3.10.10. You can re-install the corresponding version...

library pytorch_lightning.utilities.distributed problem

The versions of torchmetrics and torchvision are 0.6.0 and 0.15.2+cu118. You can try this setting.
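
To compare a local environment against the versions quoted in these replies, a small check like the one below may help. It is a sketch using only the standard library's importlib.metadata; the EXPECTED table simply restates the numbers given above, and the helper names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions quoted in the replies above. Local build tags such as "+cu118"
# depend on the CUDA build and are ignored in the comparison.
EXPECTED = {
    "pytorch-lightning": "1.4.2",
    "torch": "2.0.1",
    "torchmetrics": "0.6.0",
    "torchvision": "0.15.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each package name to (expected, installed, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        base = found.split("+")[0] if found else None
        report[package] = (wanted, found, base == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {wanted}, found {found} [{status}]")
```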

Nothing in outputs for input 640x480, after long time... 24GB VRAM

Hi, thanks for your interest in our work! In our inference script 'inference_ccsr.py', the input is resized to 512 by default. Under the default settings, the size of the...

Nothing in outputs for input 640x480, after long time... 24GB VRAM

You can enable tiling by changing the action of the 'tiled' argument from 'store_true' to 'store_false' in 'inference_ccsr.py'. You can also choose appropriate 'tile_size' and 'tile_stride' according to your GPU memory and...
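
The effect of switching the action can be seen in a toy argparse parser. This is only an illustration of how 'store_true' and 'store_false' change the default value of a flag; the parser below is not the one in 'inference_ccsr.py', and the numeric defaults for 'tile_size' and 'tile_stride' are placeholder assumptions.

```python
import argparse


def build_parser(tiled_action):
    """Build a toy parser; tiled_action is 'store_true' or 'store_false'."""
    parser = argparse.ArgumentParser()
    # 'store_true'  -> tiled defaults to False, passing --tiled sets it to True.
    # 'store_false' -> tiled defaults to True, passing --tiled sets it to False.
    parser.add_argument("--tiled", action=tiled_action)
    # Placeholder defaults, chosen only for this example.
    parser.add_argument("--tile_size", type=int, default=512)
    parser.add_argument("--tile_stride", type=int, default=256)
    return parser


if __name__ == "__main__":
    for action in ("store_true", "store_false"):
        args = build_parser(action).parse_args([])
        print(f"{action}: tiled={args.tiled}")
```

With 'store_false', tiling is on whenever the flag is omitted, which is why the edit turns tiling on without changing the command line.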

Line at bottom

Hello. 'inference_ccsr.py' pads the input LR image so that its size is a multiple of 64, which may be the cause of this problem. We have...
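
The arithmetic behind padding to a multiple of 64 can be sketched as below. The helper names are hypothetical and the snippet shows only the size computation, not how the script itself pads or crops the image.

```python
def padded_size(size, multiple=64):
    """Smallest value >= size that is divisible by multiple."""
    return size + (-size) % multiple


def padding_for(height, width, multiple=64):
    """Extra rows and columns needed to reach the next multiple."""
    return (
        padded_size(height, multiple) - height,
        padded_size(width, multiple) - width,
    )


if __name__ == "__main__":
    extra_rows, extra_cols = padding_for(480, 640)
    print(f"extra rows: {extra_rows}, extra columns: {extra_cols}")
```

A height of 480, for example, is not a multiple of 64 and would be padded to 512, while a width of 640 already is one and needs no padding.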
