Does load-balanced loss help the loss converge?

Open • mathfinder opened this issue 1 year ago • 0 comments

https://github.com/tensorflow/mesh/blob/fbf7b1e547e8b8cb134e81e1cd350c312c0b5a16/mesh_tensorflow/transformer/moe.py#L935

I tried the load-balanced loss in my project and found that it does not help the loss converge.

Does it only balance the load across experts without helping the loss converge, or might it even slightly hurt the model?
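
For reference, here is a minimal NumPy sketch of the kind of auxiliary term in question. It assumes top-1 routing and a loss of the form "number of experts times the sum over experts of (fraction of tokens routed) times (mean router probability)". The function and variable names are hypothetical and this is not the code at the linked line of moe.py.

```python
import numpy as np

def load_balance_loss(router_probs, expert_index):
    """Illustrative auxiliary loss (hypothetical helper, not the moe.py code).

    router_probs: [tokens, experts] softmax outputs of the router.
    expert_index: [tokens] integer id of the expert each token was sent to.
    """
    num_tokens, num_experts = router_probs.shape
    # Fraction of tokens actually dispatched to each expert (a hard count).
    density = np.bincount(expert_index, minlength=num_experts) / num_tokens
    # Mean router probability per expert (the smooth proxy for that count).
    density_proxy = router_probs.mean(axis=0)
    # Scaled so that perfectly uniform routing gives a value of 1.0.
    return num_experts * float(np.sum(density * density_proxy))

# Uniform routing over 4 experts versus every token sent to expert 0.
balanced = load_balance_loss(np.full((4, 4), 0.25), np.array([0, 1, 2, 3]))
collapsed = load_balance_loss(np.eye(4)[[0, 0, 0, 0]], np.array([0, 0, 0, 0]))
```

In this sketch the term depends only on routing statistics, not on the model's predictions, so it would normally be added to the task loss with a small coefficient.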

mathfinder • Jun 13 '23 02:06