Continuing my training from a checkpoint.

Open john-rocky opened this issue 6 years ago • 4 comments

I want my model to load my checkpoint and continue training. How can I make the model read the checkpoint?

john-rocky avatar Feb 16 '18 01:02 john-rocky

Normally if the checkpoint is in the folder 'checkpoint', it will be picked up automatically when you resume training. But you need to use the same parameters (batch_size, image size, etc.)

It will create the first checkpoint after 500 iterations or something.
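As a rough illustration of what "picked up automatically" amounts to, here is a minimal sketch that looks for the newest checkpoint in a folder. It assumes checkpoints are stored as files named like `DCGAN.model-<step>.index` (the `DCGAN.model-2` name shows up in the log later in this thread; the `.index` suffix and the helper name `latest_checkpoint_step` are assumptions for illustration, not the repository's code).

```python
import os
import re
import tempfile


def latest_checkpoint_step(checkpoint_dir, prefix="DCGAN.model"):
    """Return the highest step among files named '<prefix>-<step>.index'.

    Returns None if the folder is missing or holds no matching file.
    The naming scheme is an assumption made for this sketch.
    """
    if not os.path.isdir(checkpoint_dir):
        return None
    pattern = re.compile(re.escape(prefix) + r"-(\d+)\.index$")
    steps = []
    for name in os.listdir(checkpoint_dir):
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


# Small demo with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as demo_dir:
    for step in (2, 502, 1002):
        open(os.path.join(demo_dir, f"DCGAN.model-{step}.index"), "w").close()
    print(latest_checkpoint_step(demo_dir))
```

If this returns `None` for your checkpoint folder, there is nothing to resume from and training would start from scratch.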

benckx avatar Mar 10 '18 19:03 benckx

@john-rocky Normally it works like this, and @benckx is correct. However, I recently ran into something confusing: the epoch counter started from the beginning even after loading a checkpoint, which I don't think is normal. I must admit, though, that I don't have much experience with reusing checkpoints.

hi0001234d avatar Sep 05 '18 07:09 hi0001234d

Could anyone please verify if the described behavior is how “continue training” should work?

My stdout looks like this, for example:

 [*] Reading checkpoints...
 [*] Success to read DCGAN.model-2
 [*] Load SUCCESS
Epoch: [ 0/ 1] [   0/ 220] time: 14.7326, d_loss: 623.09069824, g_loss: 0.00000000

So it starts counting again from 0 …
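One way to sanity-check the counters: if the number at the end of the checkpoint name (`DCGAN.model-2`) is the global step at save time, the epoch and batch index to resume from follow from that step and the number of batches per epoch (220 in the log above). A minimal sketch under that assumption; `resume_position` is a hypothetical helper, not part of the repository:

```python
import re


def resume_position(checkpoint_name, batches_per_epoch):
    """Split the step encoded in a checkpoint name into (step, epoch, batch).

    Assumes the name ends in '-<global step>', which is an assumption
    about the naming scheme, not something confirmed in this thread.
    """
    match = re.search(r"-(\d+)$", checkpoint_name)
    if match is None:
        raise ValueError(f"no step found in checkpoint name: {checkpoint_name!r}")
    step = int(match.group(1))
    # Integer division gives the epoch, the remainder gives the batch index.
    start_epoch, start_batch = divmod(step, batches_per_epoch)
    return step, start_epoch, start_batch


print(resume_position("DCGAN.model-2", 220))  # (2, 0, 2)
```

Under this assumption, a checkpoint saved at step 2 still lies inside epoch 0, so an epoch counter of 0 would be expected for this particular checkpoint. Whether the batch index is also restored depends on how the training loop is written.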

fooness avatar Dec 04 '18 19:12 fooness

I ran into the same problem. [image] Is it normal?

madmannnn avatar May 01 '21 16:05 madmannnn