Pavel Izmailov

2 issues by Pavel Izmailov

`t3f.ops.multiply` doesn't support tensors with unknown shapes, because it checks that the outputs of `get_shape()` coincide.

bug
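The failure mode reported above can be sketched in plain Python. This is not t3f's code: the helper names `strict_equal` and `compatible` are hypothetical, and shapes are modelled as tuples in which `None` stands for a dimension that is unknown until run time. An exact comparison rejects a partly unknown shape even when it could match at run time, while a lenient comparison treats unknown dimensions as wildcards.

```python
from typing import Optional, Sequence

# Hypothetical shape model: None marks a dimension unknown until run time.
Shape = Sequence[Optional[int]]


def strict_equal(a: Shape, b: Shape) -> bool:
    """Exact comparison: an unknown dimension never matches a known one."""
    return list(a) == list(b)


def compatible(a: Shape, b: Shape) -> bool:
    """Lenient comparison: an unknown dimension matches any size."""
    if len(a) != len(b):
        return False
    return all(x is None or y is None or x == y for x, y in zip(a, b))


# Illustrative shapes: same tensor layout, batch size known in only one case.
batch_unknown = (None, 4, 4)
batch_known = (32, 4, 4)

print(strict_equal(batch_unknown, batch_known))  # exact check rejects the pair
print(compatible(batch_unknown, batch_known))    # lenient check accepts it
```

TensorFlow's `TensorShape.is_compatible_with` performs a lenient comparison of this kind; whether it is the right replacement inside `t3f.ops.multiply` is an assumption, not something the issue states.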

Hi! I noticed that in your code for the BERT AdamW optimizer you only apply weight decay to parameters whose names contain the strings `bias` or `LayerNorm.weight`: https://github.com/facebookresearch/BalancingGroups/blob/72d31e56e168b8ab03348810d4c5bac0f8a90a7a/models.py#L41-L45 The original group DRO...
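To make the grouping in question concrete, here is a minimal sketch in plain Python that works on parameter names only. It is not the linked repository's code: `split_by_marker`, `build_groups`, and the example names are hypothetical. With `decay_matched=True` the sketch decays only the names matching `bias` or `LayerNorm.weight`, which is the behaviour the issue describes; with `decay_matched=False` it exempts those names from decay, which is a common convention in BERT fine-tuning scripts.

```python
from typing import Dict, Iterable, List, Tuple

# Substrings taken from the issue text.
MARKERS: Tuple[str, ...] = ("bias", "LayerNorm.weight")

# Hypothetical parameter names, for illustration only.
EXAMPLE_NAMES: List[str] = [
    "encoder.layer.0.attention.self.query.weight",
    "encoder.layer.0.attention.self.query.bias",
    "encoder.layer.0.output.LayerNorm.weight",
    "classifier.weight",
]


def split_by_marker(
    names: Iterable[str], markers: Tuple[str, ...] = MARKERS
) -> Dict[str, List[str]]:
    """Partition names by whether they contain any marker substring."""
    matched: List[str] = []
    unmatched: List[str] = []
    for name in names:
        target = matched if any(m in name for m in markers) else unmatched
        target.append(name)
    return {"matched": matched, "unmatched": unmatched}


def build_groups(
    names: Iterable[str], weight_decay: float, decay_matched: bool
) -> List[dict]:
    """Return two optimizer-style groups: one decayed, one exempt."""
    split = split_by_marker(names)
    decayed = split["matched"] if decay_matched else split["unmatched"]
    exempt = split["unmatched"] if decay_matched else split["matched"]
    return [
        {"params": decayed, "weight_decay": weight_decay},
        {"params": exempt, "weight_decay": 0.0},
    ]


# Behaviour described in the issue: only bias / LayerNorm.weight are decayed.
for group in build_groups(EXAMPLE_NAMES, 0.01, decay_matched=True):
    print(group["weight_decay"], group["params"])
```

In a real training script the `params` lists would hold the parameter tensors rather than their names, and the two dictionaries would be passed to the optimizer as parameter groups.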