Semantic-Segmentation-Suite

Can't load checkpoint files

Open skywo1f opened this issue 5 years ago • 10 comments

I tried all of the files in the checkpoints folder:

model.ckpt.index
model.ckpt.meta
checkpoint
model.ckpt.data-00000-of-00001

None of them work:

Semantic-Segmentation-Suite/checkpoints/0295/model.ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator

skywo1f avatar Oct 14 '19 10:10 skywo1f

Have you solved this problem?

sweetboxwwy avatar Nov 14 '19 12:11 sweetboxwwy

@skywo1f @GeorgeSeif Same problem here. Can anyone suggest a fix, please? I trained on my PC and training is OK, but I cannot load the checkpoint file.

K-M-Ibrahim-Khalilullah avatar Nov 15 '19 01:11 K-M-Ibrahim-Khalilullah

Same issue:

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key BatchNorm/beta not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
[[Node: save/RestoreV2/_79 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_84_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

millermuttu avatar Nov 26 '19 06:11 millermuttu
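For a "Key ... not found in checkpoint" error like the one above, one way to narrow it down is to compare the variable names the graph expects with the names stored in the checkpoint. The sketch below only shows the comparison step. The name lists are hypothetical placeholders, and the helper `missing_from_checkpoint` is not part of the repository. In TensorFlow 1.x the real lists could probably be obtained from `tf.train.list_variables` (checkpoint side) and `tf.global_variables()` (graph side), but that part is an assumption and is not shown here.

```python
def missing_from_checkpoint(graph_names, checkpoint_names):
    """Return the names the graph wants to restore that the checkpoint lacks."""
    return sorted(set(graph_names) - set(checkpoint_names))


# Hypothetical name lists, for illustration only (not taken from a real run).
graph_names = ["BatchNorm/beta", "BatchNorm/gamma", "Conv/weights"]
checkpoint_names = ["Conv/weights", "resnet_v2_101/conv1/weights"]

print(missing_from_checkpoint(graph_names, checkpoint_names))
```

If many names are missing, the graph being built is likely not the one that was saved, for example because a different model or frontend was selected than at training time.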

Is the problem solved?

FZY2019 avatar Dec 22 '19 06:12 FZY2019

no

skywo1f avatar Dec 24 '19 16:12 skywo1f

Had the same issue. In your case, you should just pass model.ckpt, which is the checkpoint prefix rather than one of the individual files; see https://stackoverflow.com/questions/33759623/tensorflow-how-to-save-restore-a-model. Also, if you're not using CamVid, make sure to pass something to --dataset, otherwise it will default to the 32 class labels from the CamVid dataset. Hope that helps.

Imperssonator avatar Jan 01 '20 22:01 Imperssonator
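To make the "just pass model.ckpt" advice concrete, here is a minimal sketch that turns any of the files listed in the original post into the prefix TensorFlow expects. The helper name `checkpoint_prefix` is made up for this example, and the suffix pattern assumes the usual `.index`, `.meta`, and `.data-NNNNN-of-NNNNN` naming seen in this thread.

```python
import re

# Suffixes that appear on the files written next to a checkpoint prefix.
_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path):
    """Strip a known checkpoint-file suffix, leaving the prefix to restore from."""
    return _SUFFIX.sub("", path)


for name in ("model.ckpt.index",
             "model.ckpt.meta",
             "model.ckpt.data-00000-of-00001",
             "model.ckpt"):
    # Every variant maps to the same prefix: checkpoints/0295/model.ckpt
    print(checkpoint_prefix("checkpoints/0295/" + name))
```

The resulting prefix is the value to pass wherever the scripts ask for a checkpoint path.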

You'd better try the full path: disk:/your_folder/model.ckpt

wy9884255 avatar Mar 05 '20 08:03 wy9884255

Is this problem solved?

Harikrishnan24 avatar Mar 11 '20 08:03 Harikrishnan24

I think I have the solution...just use model.ckpt even if no such file exists.

I had the same struggle when trying to pass model.ckpt.meta and the other files in to resume training. Even though no file with that name exists in the directory where the checkpoint was saved, I just used model.ckpt and it worked.

mtylerpreston avatar May 15 '20 17:05 mtylerpreston
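The reason model.ckpt works even though no such file exists is that it is a prefix: the saved data lives in the sibling files model.ckpt.index and model.ckpt.data-*. A rough way to check that a prefix points at a complete checkpoint, without loading TensorFlow, could look like the sketch below. The function name `looks_like_checkpoint` is hypothetical, and the check only looks at file names, not at their contents.

```python
import glob
import os
import tempfile


def looks_like_checkpoint(prefix):
    """True if an .index file and at least one .data shard exist for `prefix`."""
    has_index = os.path.isfile(prefix + ".index")
    has_data = bool(glob.glob(glob.escape(prefix) + ".data-*-of-*"))
    return has_index and has_data


# Demo with empty placeholder files named like the ones in this thread.
with tempfile.TemporaryDirectory() as folder:
    for name in ("checkpoint", "model.ckpt.index", "model.ckpt.meta",
                 "model.ckpt.data-00000-of-00001"):
        open(os.path.join(folder, name), "w").close()

    prefix = os.path.join(folder, "model.ckpt")
    print(os.path.exists(prefix))         # no file is literally named model.ckpt
    print(looks_like_checkpoint(prefix))  # but the prefix resolves to real files
```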

Thank you for your help


Harikrishnan24 avatar May 17 '20 12:05 Harikrishnan24