returnn Introduce native assign_mul for more efficient weight decay

Open albertz opened this issue 2 years ago • 1 comments

This becomes relevant for an efficient implementation of decoupled weight decay. If the weight decay is not decoupled, it is inefficient anyway.
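To illustrate why a multiply-assign helps here: with decoupled weight decay, the decay step can be written as `w <- w - lr * wd * w`, which is the same as `w <- w * (1 - lr * wd)`, i.e. a single in-place multiplication by a scalar. The following is only a minimal pure-Python sketch of that equivalence. The function names are hypothetical, and the assumption that the decay is scaled by the learning rate is one common convention, not necessarily what RETURNN uses.

```python
import math


def decay_two_step(weights, lr, wd):
    """Naive variant: build a temporary decay term, then subtract it."""
    decay = [lr * wd * w for w in weights]  # extra temporary of full size
    return [w - d for w, d in zip(weights, decay)]


def decay_scaled_inplace(weights, lr, wd):
    """Scaled variant: one in-place multiply by a scalar factor."""
    factor = 1.0 - lr * wd  # scalar, computed once
    for i in range(len(weights)):
        weights[i] *= factor
    return weights


if __name__ == "__main__":
    w = [4.0, -8.0, 0.5]
    expected = decay_two_step(w, lr=0.5, wd=0.5)
    actual = decay_scaled_inplace(list(w), lr=0.5, wd=0.5)
    assert all(math.isclose(a, e) for a, e in zip(actual, expected))
```

The scaled variant needs no temporary of the same size as the variable, which is roughly what a native `assign_mul` would provide.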

Nov 08 '22 13:11 albertz

Note, for relevant code:
https://github.com/tensorflow/tensorflow/blob/9959f963a0afe0a5a24cb9913998fe89169df252/tensorflow/core/kernels/resource_variable_ops.cc#L630
https://github.com/tensorflow/tensorflow/blob/27dc409fcfcc538cce7447b9637a8f727ef6a123/tensorflow/core/ops/resource_variable_ops.cc#L217

Note that our assign_mul should support broadcasting, or at least a scalar as the argument. The scalar case is actually the only relevant use case for us, so it's ok if we only implement that.
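As a rough sketch of the intended semantics for the scalar-only case (not the native op itself, and not an existing RETURNN or TensorFlow API), the reference behaviour could look like this in plain Python. The name `assign_mul_scalar` and the use of `array.array` as a stand-in for a variable buffer are assumptions made for illustration.

```python
from array import array


def assign_mul_scalar(var: array, factor: float) -> array:
    """Hypothetical reference semantics: var *= factor, in place.

    Only a scalar factor is handled, matching the one use case
    described above. General broadcasting is deliberately left out.
    """
    for i in range(len(var)):
        var[i] *= factor
    return var  # same buffer, no copy


if __name__ == "__main__":
    v = array("d", [1.0, -2.0, 4.0])
    out = assign_mul_scalar(v, 0.5)
    assert out is v  # updated in place
    print(list(v))
```

A native kernel would do the same element-wise scaling directly on the variable's memory.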

Nov 08 '22 13:11 albertz
