
Error during Large Scale Training

Open Kavit212 opened this issue 4 years ago • 1 comments

Hi, thank you for sharing your wonderful work. When running large-scale training, I encountered this error:

Traceback (most recent call last):
  File "main.py", line 49, in <module>
    main()
  File "main.py", line 45, in main
    scale=SCALE,num_of_data=NUM_OF_DATA, conf=conf)
TypeError: __init__() missing 1 required positional argument: 'model_num'

It would be very helpful if you could help me resolve this.

Thank you

Kavit212 avatar Dec 15 '20 14:12 Kavit212
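For readers hitting the same message, here is a minimal sketch of what this kind of TypeError means. The class name `Trainer` and the argument values are hypothetical (the real class in main.py is not shown in this thread); only the keyword names `scale`, `num_of_data`, `conf`, and `model_num` come from the traceback. The constructor declares a parameter with no default, and the call does not pass it.

```python
class Trainer:  # hypothetical stand-in for the class constructed in main.py
    def __init__(self, scale, num_of_data, conf, model_num):
        # model_num has no default value, so every caller must supply it.
        self.scale = scale
        self.num_of_data = num_of_data
        self.conf = conf
        self.model_num = model_num


try:
    # Same keyword arguments as the failing call: model_num is missing.
    Trainer(scale=2, num_of_data=100, conf={})
except TypeError as err:
    message = str(err)
    print(message)

# Passing the missing argument lets the constructor run.
trainer = Trainer(scale=2, num_of_data=100, conf={}, model_num=0)
```

Which value `model_num` should take in the actual repository is not stated here, so the `0` above is only a placeholder.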

Could you please explain how these loss weights are calculated?

def get_loss_weights(self):
    loss_weights = tf.ones(shape=[self.TASK_ITER]) * (1.0/self.TASK_ITER)
    decay_rate = 1.0 / self.TASK_ITER / (10000 / 3)
    min_value= 0.03 / self.TASK_ITER

    loss_weights_pre = tf.maximum(loss_weights[:-1] - (tf.multiply(tf.to_float(self.global_step), decay_rate)), min_value)

    loss_weight_cur= tf.minimum(loss_weights[-1] + (tf.multiply(tf.to_float(self.global_step),(self.TASK_ITER- 1) * decay_rate)), 1.0 - ((self.TASK_ITER - 1) * min_value))
    loss_weights = tf.concat([[loss_weights_pre], [[loss_weight_cur]]], axis=1)
    return loss_weights

BassantTolba1234 avatar Jan 06 '21 10:01 BassantTolba1234
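One way to read the function above is to re-express its arithmetic in plain NumPy. This is a sketch under assumptions, not the repository's code: the function name `loss_weights_at` and its parameter names are made up for illustration, `step` stands in for `self.global_step`, and the result is a 1-D array rather than the `[1, TASK_ITER]` tensor the original returns. Every step starts with an equal weight of `1/TASK_ITER`; as training proceeds, the weights of all steps except the last shrink linearly down to a floor of `0.03/TASK_ITER`, and the last step's weight grows by exactly the amount the others lose, so the weights always sum to 1.

```python
import numpy as np


def loss_weights_at(step, task_iter, decay_steps=10000 / 3, min_frac=0.03):
    """NumPy re-expression of get_loss_weights for a given global step."""
    base = 1.0 / task_iter               # initial, uniform weight per step
    decay_rate = base / decay_steps      # per-step decrease for earlier steps
    min_value = min_frac / task_iter     # floor for earlier steps

    # Weights of all steps except the last: decay linearly, clipped at the floor.
    pre = np.maximum(np.full(task_iter - 1, base) - step * decay_rate, min_value)

    # Weight of the last step: absorbs what the others lose, clipped at the cap.
    cur = min(base + step * (task_iter - 1) * decay_rate,
              1.0 - (task_iter - 1) * min_value)

    return np.concatenate([pre, [cur]])


w_start = loss_weights_at(step=0, task_iter=5)
w_mid = loss_weights_at(step=1000, task_iter=5)
w_end = loss_weights_at(step=10000, task_iter=5)
print(w_start, w_mid, w_end)
```

With `task_iter=5`, the weights start at 0.2 each, reach 0.14 for the first four steps and 0.44 for the last step at `step=1000`, and settle at 0.006 and 0.976 once the floor is reached (after roughly 3234 steps with these constants).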