Weight-Normalization

What role does the param init play?

MachineJeff opened this issue · 3 comments

Hi, I'd like to know what the param init does in your code.

And why do you create 3 models in the same train script? I was confused.

MachineJeff · Nov 25 '19

Hello. In the templates, the weights and all other parameters are defined with "tf.get_variable(...)", and they are defined once, inside the model template. As a result, the initialization, training, and testing phases share the same weights. Before the training phase, the parameters are initialized by calling the model template on the initialization inputs, so they are initialized with a feed-forward step (data-dependent initialization). After the parameters have been initialized, the training phase runs with those initialized parameters. A template can handle being called multiple times: if the parameters were already initialized by an earlier call, the next call to the model template reuses the initialized variables.
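
As an illustration only (a minimal sketch, assuming the model template is built with tf.make_template; the names below are not from the repo), calling the same template three times creates the variables once and reuses them afterwards:

    import tensorflow as tf

    def _net(x):
        # every parameter is created with tf.get_variable, so a template can reuse it
        w = tf.get_variable('w', [x.get_shape().as_list()[-1], 10], tf.float32,
                            tf.random_normal_initializer(0, 0.05))
        return tf.matmul(x, w)

    model = tf.make_template('model', _net)

    x_init  = tf.placeholder(tf.float32, [None, 4])
    x_train = tf.placeholder(tf.float32, [None, 4])
    x_test  = tf.placeholder(tf.float32, [None, 4])

    init_out  = model(x_init)   # first call: creates 'model/w'
    train_out = model(x_train)  # second call: reuses the same 'model/w'
    test_out  = model(x_test)   # third call: reuses it again

    # all three sub-graphs share a single set of weights
    assert len(tf.trainable_variables()) == 1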

zoli333 · Nov 25 '19

Right, I get your point.

But why not remove the param init and just create one model, like this:

init_forward = model(x_init, keep_prob=0.5, deterministic=is_training,
                     use_weight_normalization=use_weight_normalization,
                     use_batch_normalization=use_batch_normalization,
                     use_mean_only_batch_normalization=use_mean_only_batch_normalization)

Use the variable is_training to distinguish between training and testing, and then run both train and test through the same init_forward model?
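
Just to make the idea concrete (a minimal sketch, not the repo's code: dropout stands in for whatever behaviour differs between phases, and is_training is fed as a boolean placeholder):

    import numpy as np
    import tensorflow as tf

    is_training = tf.placeholder(tf.bool, name='is_training')
    x = tf.placeholder(tf.float32, [None, 64])

    # one graph; the boolean placeholder picks the stochastic or deterministic path
    h = tf.cond(is_training,
                lambda: tf.nn.dropout(x, keep_prob=0.5),  # training path
                lambda: x)                                 # test path

    batch = np.random.rand(8, 64).astype(np.float32)
    with tf.Session() as sess:
        train_out = sess.run(h, feed_dict={x: batch, is_training: True})
        test_out  = sess.run(h, feed_dict={x: batch, is_training: False})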

MachineJeff · Nov 26 '19

Besides, I really appreciate your weight_norm code. I don't like the param init, so I rewrote the weight_norm code into this:

def wn_conv1d(x, kernel_size, channels, scope, stride=1, pad='SAME', dilation=1, nonlinearity=None, init_scale=1.):
    # x is expected to be 4-D, since the 1-D convolution is expressed as a conv2d with filter height 1
    xs = int_shape(x)              # int_shape: helper returning the static shape as a list of ints
    filter_size = [1, kernel_size]
    dila = [1, dilation]
    strs = [1, stride]
    with tf.variable_scope(scope):
        # data based initialization of parameters
        V = tf.get_variable('V', filter_size + [xs[-1], channels], tf.float32,
                            tf.random_normal_initializer(0, 0.05), trainable=True)
        # normalize the filter over its non-output axes, as in weight normalization
        V_norm = tf.nn.l2_normalize(V.initialized_value(), [0, 1, 2])
        # strides and dilations are given as full length-4 NHWC lists
        x_init = tf.nn.conv2d(x, V_norm, [1] + strs + [1], pad, dilations=[1] + dila + [1])
        # per-channel statistics of the initial pre-activations
        m_init, v_init = tf.nn.moments(x_init, [0, 1, 2])
        scale_init = init_scale / tf.sqrt(v_init + 1e-8)
        # rescale so each output channel initially has zero mean and unit variance
        x_init = tf.reshape(scale_init, [1, 1, 1, channels]) * (x_init - tf.reshape(m_init, [1, 1, 1, channels]))
        if nonlinearity is not None:
            x_init = nonlinearity(x_init)
        return x_init

That's a conv1d rather than a conv2d, even though it calls tf.nn.conv2d internally (it doesn't matter for the question).

Do you think there is any problem with my code?

MachineJeff · Nov 26 '19