
BasicSynthesizer last layer bias not zero

Open djd1283 opened this issue 5 years ago • 0 comments

zero-initialize the last layer, as in the paper

    if n_hidden > 0:
        init.constant(self.layers[-1].weight, 0)
    else:
        init.constant(self.input_trigger.weight, 0)
        if context_dim is not None:
            init.constant(self.input_context.weight, 0)

The BasicSynthesizer class zero-initialises the weights of the DNI's final layer, but it does not zero the biases. From the paper:

"The final regression layer of all synthetic gradient models are initialised with zero weights and biases, so initially, zero synthetic gradient is produced."

In my experiments I observed that the initial DNI gradient was not zero; it was equal to the bias term.
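A minimal sketch of a possible fix, assuming the final layer is a standard `nn.Linear`: zero both the weight and the bias so the layer's initial output (and hence the initial synthetic gradient) is exactly zero. The helper name `zero_init_final` is hypothetical; `init.constant_` is the in-place variant of the now-deprecated `init.constant` used in the snippet above.

```python
import torch
import torch.nn as nn
from torch.nn import init

def zero_init_final(layer: nn.Linear) -> None:
    """Zero-initialise a linear layer's weight AND bias in place.

    (Hypothetical helper; zeroing only the weight, as in the current
    code, leaves the layer outputting its bias term.)
    """
    init.constant_(layer.weight, 0)
    if layer.bias is not None:
        init.constant_(layer.bias, 0)

# With both tensors zeroed, the layer maps any input to zero,
# so the synthesizer initially produces a zero synthetic gradient.
layer = nn.Linear(16, 16)
zero_init_final(layer)
out = layer(torch.randn(4, 16))
print(out.abs().sum().item())  # → 0.0
```

The same two-line pattern would apply to `self.input_trigger` and `self.input_context` in the `n_hidden == 0` branch.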

djd1283 · May 21 '19 17:05