MZSR
When I ran the large-scale training code, I ran into a problem. Could you help me?
(base) wit@wit:/media/wit/Data1/WH/MZSR-master-new/Large-Scale_Training$ python3 main.py --gpu 0 --trial 2 --step 0
Initialize Training
Build Model MODEL
Initialize weights MODEL
Setting Train Configuration
Model Params: 225 K
========== Reading Checkpoints ============
============= Failed to find a checkpoint =============
========== No model to load ===========
Training Starts
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1278, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Feature: image (data type: string) is required but could not be found.
  [[Node: ParseSingleExample/ParseSingleExample = ParseSingleExample[Tdense=[DT_STRING, DT_STRING], dense_keys=["image", "label"], dense_shapes=[[], []], num_sparse=0, sparse_keys=[], sparse_types=[]](arg0, ParseSingleExample/Const, ParseSingleExample/Const)]]
  [[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,64,64,3], [?,16,16,3]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 49, in
Dear Sir, amazing work, congratulations! I have a question: could you tell me the full checkpoint path I should set for the model produced by large-scale training, so that I can use it as the pre-trained model for meta-transfer training? I look forward to your reply. Thanks in advance.
@wh2333 I'm running into a problem when I load the pretrained model, specifically when it reads the checkpoint. This is the error. Could you please tell me how you solved it?
NotFoundError (see above for traceback): Key MODEL/conv7/kernel/Adam_3 not found in checkpoint [[Node: save/RestoreV2_69 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_69/tensor_names, save/RestoreV2_69/shape_and_slices)]]
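A possible way to narrow this down, offered as a sketch and not as a confirmed diagnosis: in TensorFlow 1.x this error generally means the graph contains a variable (here an Adam slot variable, `MODEL/conv7/kernel/Adam_3`) that the checkpoint never saved. Comparing the two name lists shows which variables can be restored and which cannot. The helper `split_restorable` and the example name lists below are hypothetical; in a real session the names could be collected with `tf.global_variables()` and `tf.train.list_variables(...)`.

```python
def split_restorable(graph_names, checkpoint_names):
    """Partition graph variable names by whether the checkpoint contains them."""
    saved = set(checkpoint_names)
    restorable = [name for name in graph_names if name in saved]
    missing = [name for name in graph_names if name not in saved]
    return restorable, missing


# Hypothetical name lists, for illustration only. In TensorFlow 1.x they could
# be gathered with:
#   graph_names = [v.op.name for v in tf.global_variables()]
#   checkpoint_names = [name for name, _ in tf.train.list_variables(ckpt_dir)]
graph_names = [
    "MODEL/conv7/kernel",
    "MODEL/conv7/kernel/Adam",
    "MODEL/conv7/kernel/Adam_3",
]
checkpoint_names = [
    "MODEL/conv7/kernel",
    "MODEL/conv7/kernel/Adam",
]

restorable, missing = split_restorable(graph_names, checkpoint_names)
print("restorable:", restorable)
print("missing:", missing)
```

If only optimizer slots turn out to be missing, one option is to build the saver over the restorable variables alone, e.g. `tf.train.Saver(var_list=[v for v in tf.global_variables() if v.op.name in restorable])`, and let the remaining variables be initialized normally.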
Could you please explain how these loss weights are calculated?
def get_loss_weights(self):
    loss_weights = tf.ones(shape=[self.TASK_ITER]) * (1.0 / self.TASK_ITER)
    decay_rate = 1.0 / self.TASK_ITER / (10000 / 3)
    min_value = 0.03 / self.TASK_ITER

    loss_weights_pre = tf.maximum(loss_weights[:-1] - (tf.multiply(tf.to_float(self.global_step), decay_rate)), min_value)
    loss_weight_cur = tf.minimum(loss_weights[-1] + (tf.multiply(tf.to_float(self.global_step), (self.TASK_ITER - 1) * decay_rate)), 1.0 - ((self.TASK_ITER - 1) * min_value))
    loss_weights = tf.concat([[loss_weights_pre], [[loss_weight_cur]]], axis=1)
    return loss_weights
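Reading the function as written, the weights start uniform at 1/TASK_ITER. As `global_step` grows, every weight except the last decreases linearly until it reaches the floor 0.03/TASK_ITER, while the last weight increases by the same total amount, so the weights always sum to 1. Both clamps are reached once step * decay_rate = 0.97/TASK_ITER, that is after about 0.97 * 10000 / 3 ≈ 3233 steps. The NumPy sketch below mirrors the same arithmetic outside TensorFlow. The function name `loss_weights_at` and the value `task_iter=5` are assumptions for illustration, and the result is a flat vector instead of the `[1, TASK_ITER]` tensor the original returns.

```python
import numpy as np


def loss_weights_at(step, task_iter):
    """NumPy mirror of get_loss_weights for a given global step."""
    base = np.full(task_iter, 1.0 / task_iter)   # uniform starting weights
    decay_rate = 1.0 / task_iter / (10000 / 3)   # per-step change of each early weight
    min_value = 0.03 / task_iter                 # floor for the early weights

    # All weights except the last shrink linearly down to the floor.
    pre = np.maximum(base[:-1] - step * decay_rate, min_value)
    # The last weight absorbs what the others lose, up to its cap.
    cur = min(base[-1] + step * (task_iter - 1) * decay_rate,
              1.0 - (task_iter - 1) * min_value)
    return np.concatenate([pre, [cur]])


# task_iter=5 is an assumed value for illustration.
for step in (0, 1000, 10000):
    weights = loss_weights_at(step, task_iter=5)
    print(step, weights, weights.sum())
```

With these assumed settings the weights are uniform at step 0 and end up with the four early weights at 0.006 each and the last at 0.976.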
Hi, I ran into the same problem and would really like to know how you fixed it.