Grzegorz George Pawelczak comments

Results: 6 comments by Grzegorz George Pawelczak

Feature request: optimizer support with slot variables in different dtype to the variable.

One could write an optimizer (for example Adam) for a model which has the weights and gradients in fp16, but the slot variables might have to be in higher precision...
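A minimal sketch of the idea in this comment, assuming plain NumPy rather than any real optimizer API: the weights and gradients are fp16, while the two Adam slot variables (first and second moments) are kept in fp32. The function name `adam_step_mixed` and the hyperparameter defaults are hypothetical and chosen only for illustration; this is not the solution proposed in the thread.

```python
import numpy as np


def adam_step_mixed(param, grad, m, v, step,
                    lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update with fp16 param/grad and fp32 slots (hypothetical helper)."""
    g32 = grad.astype(np.float32)              # upcast the fp16 gradient once
    m = beta1 * m + (1.0 - beta1) * g32        # first-moment slot, stays fp32
    v = beta2 * v + (1.0 - beta2) * g32 * g32  # second-moment slot, stays fp32
    m_hat = m / (1.0 - beta1 ** step)          # bias correction
    v_hat = v / (1.0 - beta2 ** step)
    update = lr * m_hat / (np.sqrt(v_hat) + eps)
    # Apply the update in fp32, then cast the weights back to fp16.
    new_param = (param.astype(np.float32) - update).astype(np.float16)
    return new_param, m, v


# Illustrative values: small gradients whose squares underflow in fp16.
param = np.array([1.0, -0.5], dtype=np.float16)
grad = np.array([1e-4, -1e-4], dtype=np.float16)
m = np.zeros(2, dtype=np.float32)
v = np.zeros(2, dtype=np.float32)

new_param, m, v = adam_step_mixed(param, grad, m, v, step=1)
fp16_square = grad * grad  # computed entirely in fp16
```

In this example the squared gradient is about 1e-8, which is below the smallest positive fp16 value, so `fp16_square` is all zeros while the fp32 slot `v` stays nonzero. That is one reason the slots might need a different dtype from the variable.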

Feature request: optimizer support with slot variables in different dtype to the variable.

Any thoughts on the proposed solution?

Gradient accumulation support?

Hey, I just wanted to throw in some personal experience with working on gradient accumulation in TF/Keras at Graphcore for IPUs. 1. Batch Norm - for the MLPerf submission distributed...