Pavel Izmailov
Results: 2 issues of Pavel Izmailov
`t3f.ops.multiply` doesn't support unknown shapes because it checks that the outputs of `get_shape()` coincide.
bug
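To illustrate the kind of failure this issue title describes, here is a minimal plain-Python sketch. It is not t3f's code: the helper names `strictly_equal` and `compatible` are hypothetical, and it assumes unknown dimensions are represented as `None`, the way static shapes report them in TensorFlow.

```python
from typing import Optional, Sequence

# A static shape: None marks a dimension that is unknown until run time.
Shape = Sequence[Optional[int]]


def strictly_equal(a: Shape, b: Shape) -> bool:
    """Strict check: every dimension must be known and identical."""
    return len(a) == len(b) and all(
        x is not None and y is not None and x == y for x, y in zip(a, b)
    )


def compatible(a: Shape, b: Shape) -> bool:
    """Relaxed check: an unknown dimension is allowed to match anything."""
    return len(a) == len(b) and all(
        x is None or y is None or x == y for x, y in zip(a, b)
    )


# Two operands with an unknown batch dimension (hypothetical shapes).
lhs = (None, 4, 4)
rhs = (None, 4, 4)
print(strictly_equal(lhs, rhs))  # the strict check rejects unknown dimensions
print(compatible(lhs, rhs))      # the relaxed check accepts them
```

A strict comparison of static shapes rejects operands whose dimensions are only known at run time, while a compatibility-style check defers those dimensions. TensorFlow's `TensorShape.is_compatible_with` follows the relaxed idea; whether that is the right fix for t3f is not stated in the issue.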
Hi! I noticed that in your code for the BERT AdamW optimizer you only apply weight decay to parameters that contain the strings `bias` or `LayerNorm.weight`: https://github.com/facebookresearch/BalancingGroups/blob/72d31e56e168b8ab03348810d4c5bac0f8a90a7a/models.py#L41-L45 The original group DRO...
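The name-based grouping this issue refers to can be sketched in plain Python. This is an illustration only, not the repository's code: `split_by_name` and the example parameter names are hypothetical, and the sketch takes no position on which group should receive weight decay.

```python
from typing import Iterable, List, Tuple

# Substrings named in the issue text.
MARKERS = ("bias", "LayerNorm.weight")


def split_by_name(
    names: Iterable[str], markers: Tuple[str, ...] = MARKERS
) -> Tuple[List[str], List[str]]:
    """Split parameter names into (matching, non-matching) by substring."""
    matched: List[str] = []
    unmatched: List[str] = []
    for name in names:
        if any(marker in name for marker in markers):
            matched.append(name)
        else:
            unmatched.append(name)
    return matched, unmatched


# Hypothetical parameter names in the style of a BERT-like model.
names = [
    "encoder.layer.0.attention.self.query.weight",
    "encoder.layer.0.attention.self.query.bias",
    "encoder.layer.0.output.LayerNorm.weight",
    "classifier.weight",
]
matched, unmatched = split_by_name(names)
print(matched)
print(unmatched)
```

With a PyTorch optimizer, each of the two groups would typically be passed as a dictionary with its own `weight_decay` value, so swapping the groups changes which parameters are regularized.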