context-encoder
How to resume training from a certain epoch?
Hi! Your work is great, but I want to know: if I want to resume training from a certain epoch, how do I edit this in your center-inpaint training code?
When I try to load the pre-trained network and then continue training it, I get the following error. Any help @pathak22?

```
/home/maryam/torch/install/bin/lua: /home/maryam/torch/install/share/lua/5.2/nn/Module.lua:327: check that you are sharing parameters and gradParameters
stack traceback:
	[C]: in function 'assert'
	/home/maryam/torch/install/share/lua/5.2/nn/Module.lua:327: in function 'getParameters'
	train.lua:270: in main chunk
	[C]: in function 'dofile'
	...ryam/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: in ?
```
@pathak22 kindly help me
@maryam089 Can you paste the full log? Where is this pre-trained network from? Was it trained with the context encoder training code? Are the architecture and other details the same?
Well, I am using your already-trained network on ImageNet 100k (center-region inpainting), but now I want to add a few thousand more images and train it again. How can I load the weights and parameters into the network so I can train it for 1 or 2 more epochs starting from the already-trained network? @pathak22
@pathak22 If I stop training for any reason, can I resume training from the epoch it stopped at? It always starts from the beginning.
@NerminSalem @maryam089
Sorry, I didn't provide this functionality in the training code (I should have!). But it should not be hard to implement if you look at this file and see how the network is first loaded. After that, the loaded network is the same as the one defined here, and hence you won't need to define it again. Feel free to make a pull request if you would like. Thanks!
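A minimal sketch of the change described above, assuming the checkpoints were saved with `torch.save` (as in DCGAN-style Torch training scripts). The option names `opt.netG`/`opt.netD` and the `defineG`/`defineD` placeholders are hypothetical, not names from the released `train.lua`:

```lua
-- Hypothetical resume logic for train.lua: load saved generator/discriminator
-- checkpoints instead of building fresh networks. opt.netG / opt.netD are
-- assumed new options holding checkpoint paths ('' means train from scratch).
require 'torch'
require 'nn'

local netG, netD
if opt.netG ~= '' then
  -- Resume: the loaded network replaces the freshly defined one, so the
  -- original architecture definition below can be skipped entirely.
  netG = torch.load(opt.netG)
  netD = torch.load(opt.netD)
else
  -- Fresh start: define netG and netD as in the original train.lua.
  netG = defineG()  -- placeholder for the original generator definition
  netD = defineD()  -- placeholder for the original discriminator definition
end

-- Call getParameters() exactly once per network, after loading. Calling it
-- on a network whose parameter storages were already flattened (e.g. a
-- checkpoint saved after getParameters) can trigger the
-- "check that you are sharing parameters and gradParameters" assert
-- reported above; saving a cleared clone of the network avoids this.
local parametersG, gradParametersG = netG:getParameters()
local parametersD, gradParametersD = netD:getParameters()
```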
@pathak22
Hi,
This is regarding re-training the ImageNet/Paris model that you have shared.
We referred to the two links below:
https://github.com/torch/demos/blob/master/train-a-digit-classifier/train-on-mnist.lua
https://github.com/facebook/fb.resnet.torch/issues/116
And we understand that there is a command-line argument to indicate whether it is re-training or training from the beginning.
Do you have any such command-line argument in your code to indicate re-training?
Can you please suggest the code changes that would be needed?
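As noted earlier in the thread, there is no such flag in the released code; it would need to be added. A hedged sketch of how the options table in `train.lua` could be extended, following the environment-variable override pattern used by DCGAN-style Torch scripts (the option names `netG`, `netD`, and `epochStart` are hypothetical additions):

```lua
-- Hypothetical additions to the opt table in train.lua to support resuming:
local opt = {
  -- ... existing options (batchSize, niter, lr, etc.) ...
  netG = '',        -- path to a saved generator checkpoint ('' = train from scratch)
  netD = '',        -- path to a saved discriminator checkpoint
  epochStart = 1,   -- epoch number to resume counting from
}
-- Override options from environment variables, e.g.:
--   netG=checkpoints/net_20_G.t7 epochStart=21 th train.lua
for k, v in pairs(opt) do
  opt[k] = tonumber(os.getenv(k)) or os.getenv(k) or opt[k]
end

-- The training loop would then start at the resumed epoch:
-- for epoch = opt.epochStart, opt.niter do ... end
```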