imagen-pytorch

Loss oscillation

Open FTKyaoyuan opened this issue 2 years ago • 34 comments

Hi, my loss looks strange. Is it normal? [image]

FTKyaoyuan avatar Oct 20 '22 01:10 FTKyaoyuan

I got a similar loss curve. Did you observe any visual improvement in your synthesized image?

Feanor007 avatar Oct 20 '22 09:10 Feanor007
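
Editorial aside, not from the thread participants: per-step diffusion losses are typically noisy, because every step draws random timesteps and random noise, so the trend is often easier to judge from a smoothed curve than from the raw one. A minimal sketch, assuming the per-step losses have been collected in a plain Python list; the function name `ema_smooth` and the default factor 0.98 are illustrative choices, not anything from imagen-pytorch:

```python
def ema_smooth(losses, beta=0.98):
    """Return a bias-corrected exponential moving average of a loss series."""
    smoothed = []
    avg = 0.0
    for step, loss in enumerate(losses, start=1):
        # Blend the running average with the newest loss value.
        avg = beta * avg + (1.0 - beta) * loss
        # Bias correction, so early values are not pulled toward zero.
        smoothed.append(avg / (1.0 - beta ** step))
    return smoothed


# Hypothetical values: a loss that jumps between 0.08 and 0.12 every step.
raw = [0.08 if i % 2 == 0 else 0.12 for i in range(200)]
smooth = ema_smooth(raw)
print(f"raw range: {min(raw):.3f} to {max(raw):.3f}")
print(f"last smoothed value: {smooth[-1]:.3f}")
```

If the smoothed curve is still drifting downward, the oscillation in the raw curve is probably just step-to-step noise rather than a sign of divergence.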

I got a similar loss curve. Did you observe any visual improvement in your synthesized image?

I'm still training my first unet, so I'm also waiting to sample. I will start training my second unet tomorrow.

FTKyaoyuan avatar Oct 20 '22 09:10 FTKyaoyuan

I got a similar loss curve. Did you observe any visual improvement in your synthesized image?

What did your training result look like?

FTKyaoyuan avatar Oct 20 '22 09:10 FTKyaoyuan

I trained my model to synthesize grayscale images (channels = 1). This is what it looks like after 5000 steps: [image: Screen Shot 2022-10-20 at 10 21 58] This is what it looks like now (12000 steps): [image: Screen Shot 2022-10-20 at 10 23 12]

You can see there are improvements, but it is still not perfect for a thoracic CT scan.

Feanor007 avatar Oct 20 '22 09:10 Feanor007

Also, training was really slow for me. It took more than 24 hours to reach 10000 steps. Is it the same for you?

Feanor007 avatar Oct 20 '22 09:10 Feanor007

Also, training was really slow for me. It took more than 24 hours to reach 10000 steps. Is it the same for you?

[image] I trained about 450k steps in 24 hours.

FTKyaoyuan avatar Oct 20 '22 09:10 FTKyaoyuan
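
Editorial aside: the two figures quoted above can be put on the same scale with a little arithmetic. A minimal sketch, assuming exactly 24 hours for both runs (the first run took "more than 24 hours", so its real rate is somewhat lower); the helper name `steps_per_second` is hypothetical:

```python
def steps_per_second(steps, hours):
    """Average training throughput over a wall-clock interval."""
    return steps / (hours * 3600.0)


slow = steps_per_second(10_000, 24)    # figure reported by Feanor007
fast = steps_per_second(450_000, 24)   # figure reported by FTKyaoyuan

print(f"slow run: {slow:.3f} steps/s ({1.0 / slow:.1f} s per step)")
print(f"fast run: {fast:.3f} steps/s")
print(f"ratio: {fast / slow:.0f}x")
```

That works out to roughly 8.6 seconds per step against about 5 steps per second, a gap of about 45x. The runs differ in GPU, image shape, and probably batch size, so this is only a rough comparison.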

what does your result look like?

Feanor007 avatar Oct 20 '22 09:10 Feanor007

I don't know yet, because I have only trained unet1. I have two unets, and I want to view the results after training both.

FTKyaoyuan avatar Oct 20 '22 09:10 FTKyaoyuan

Tomorrow I will train unet2.

FTKyaoyuan avatar Oct 20 '22 09:10 FTKyaoyuan

The GPU I am using is a 3080. And yours?

FTKyaoyuan avatar Oct 20 '22 09:10 FTKyaoyuan

The GPU I am using is a 3080. And yours?

Nvidia Titan X. Looking forward to your results!

Feanor007 avatar Oct 20 '22 09:10 Feanor007

The GPU I am using is a 3080. And yours?

Nvidia Titan X. Looking forward to your results!

How many unets are you using?

FTKyaoyuan avatar Oct 20 '22 09:10 FTKyaoyuan

The GPU I am using is a 3080. And yours?

Nvidia Titan X. Looking forward to your results!

How many unets are you using?

Just one U-Net; the image size is 64 * 64 * 32. I will do the super-resolution later.

Feanor007 avatar Oct 20 '22 09:10 Feanor007

OK. If there is a result, I will let you know as soon as possible. Your results look like they should be fine if you keep training. Try training for another 24 hours. You can try changing lr to 1e-5.

FTKyaoyuan avatar Oct 20 '22 09:10 FTKyaoyuan

OK. If there is a result, I will let you know as soon as possible. Your results look like they should be fine if you keep training. Try training for another 24 hours. You can try changing lr to 1e-5.

Thanks. If you don't mind, can you take a look at issue #245? Perhaps there is something wrong with the setup that makes my training so slow?

Feanor007 avatar Oct 20 '22 13:10 Feanor007

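[image] Hi

OK. If there is a result, I will let you know as soon as possible. Your results look like they should be fine if you keep training. Try training for another 24 hours. You can try changing lr to 1e-5.

Thanks. If you don't mind, can you take a look at issue #245? Perhaps there is something wrong with the setup that makes my training so slow?

This is my sample. It looks like an image, but I think it has nothing to do with my text, and the resolution is rather low. I can't even see the content clearly.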

FTKyaoyuan avatar Oct 21 '22 05:10 FTKyaoyuan

[image] [image]

FTKyaoyuan avatar Oct 21 '22 05:10 FTKyaoyuan

[image]

FTKyaoyuan avatar Oct 21 '22 05:10 FTKyaoyuan

Maybe train the U-Nets separately? Easier for debugging

Feanor007 avatar Oct 21 '22 10:10 Feanor007

By the way, how did you sample using two U-Nets without getting OOM error?

Feanor007 avatar Oct 21 '22 13:10 Feanor007

Maybe train the U-Nets separately? Easier for debugging

I still don't quite understand the purpose of using two unets to achieve different resolutions. For example, does 64*64 use the first one, and 128*128 use the second one?

FTKyaoyuan avatar Oct 24 '22 02:10 FTKyaoyuan
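
Editorial aside on the question above: in a cascaded diffusion setup such as Imagen, the first unet generates a low-resolution image from the text, and each later unet is a super-resolution model that upsamples the previous stage's output while conditioning on it. So the stages run in sequence rather than as alternatives. A sketch in plain Python of how stages map to resolutions, assuming a two-stage cascade at 64 and 128 pixels as in the question; the names `CascadeStage` and `plan_cascade` are hypothetical and not part of imagen-pytorch:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CascadeStage:
    unet_number: int           # 1-based index, as in "unet1" and "unet2"
    input_size: Optional[int]  # None for the base stage, which starts from noise
    output_size: int           # side length of the square image this stage produces
    role: str


def plan_cascade(image_sizes: List[int]) -> List[CascadeStage]:
    """Describe which unet produces which resolution in a cascade."""
    stages = []
    previous = None
    for number, size in enumerate(image_sizes, start=1):
        if previous is not None and size <= previous:
            raise ValueError("each stage should produce a larger image than the last")
        role = "base text-to-image" if previous is None else "super-resolution"
        stages.append(CascadeStage(number, previous, size, role))
        previous = size
    return stages


for stage in plan_cascade([64, 128]):
    print(stage)
```

Under this reading, sampling at 128*128 needs both unets: the first produces the 64*64 image, and the second upsamples it.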

By the way, how did you sample using two U-Nets without getting OOM error?

I didn't specify a unet when I sampled, and no error was reported.

FTKyaoyuan avatar Oct 24 '22 02:10 FTKyaoyuan

Hello, do you have a trained model now? I would like to ask about the specific values of your Unet parameters. The results I have so far are not good; the images are rather blurry.

QinSY123 avatar Oct 27 '22 09:10 QinSY123

And how many epochs have you trained, and which dataset are you using?

QinSY123 avatar Oct 27 '22 09:10 QinSY123

Hello, do you have a trained model now? I would like to ask about the specific values of your Unet parameters. The results I have so far are not good; the images are rather blurry.

I used two unets with the MS COCO dataset, but I feel that my resolution is relatively small, so the generated results do not look good.

FTKyaoyuan avatar Oct 27 '22 09:10 FTKyaoyuan

And how many epochs have you trained, and which dataset are you using?

5 epochs.

FTKyaoyuan avatar Oct 27 '22 09:10 FTKyaoyuan

And how many epochs have you trained, and which dataset are you using?

Can I see your results?

FTKyaoyuan avatar Oct 27 '22 09:10 FTKyaoyuan

[image: 1] text: a big building with a clock at the top of it [image: 1666856724998] @FTKyaoyuan

QinSY123 avatar Oct 27 '22 11:10 QinSY123

I trained the model on the COCO2017 dataset with a total of 150,000 data pairs.

QinSY123 avatar Oct 27 '22 11:10 QinSY123

[image: 1] text: a big building with a clock at the top of it [image: 1666856724998] @FTKyaoyuan

Maybe you should pad and then resize; otherwise the images will be distorted.

FTKyaoyuan avatar Oct 28 '22 01:10 FTKyaoyuan
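
Editorial aside: one way to read the suggestion above is to pad each image to a square before resizing, so that the aspect ratio is preserved. A minimal sketch of the padding arithmetic in plain Python, assuming centered padding and a square target; the function name `square_padding` is hypothetical, and an image library would still be needed to apply the padding and the resize:

```python
def square_padding(width, height):
    """Return (left, top, right, bottom) padding that makes the image square."""
    side = max(width, height)
    pad_w = side - width
    pad_h = side - height
    # Split the padding as evenly as possible; any odd pixel goes right/bottom.
    left = pad_w // 2
    top = pad_h // 2
    return left, top, pad_w - left, pad_h - top


# A 640x480 photo becomes 640x640 after padding,
# and can then be resized to 64x64 without stretching.
print(square_padding(640, 480))  # → (0, 80, 0, 80)
```

Resizing a non-square image directly to a square target stretches it along one axis, which is the distortion described in the comment.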