fastmoe

How to use balance loss?

Open Heihaierr opened this issue 1 year ago • 1 comments

How do I apply the balance loss? Could you add it to the 'transformer-xl' example?

Heihaierr, Oct 26 '23 07:10

Sorry for the late reply.

The BaseGate module provides set_loss, get_loss and has_loss. A customized gate (or any gate in FastMoE that has a balance loss) calls self.set_loss to store the loss value in the module. You can then read that value back with the gate module's get_loss and add it to the final training loss (e.g. by adding it in the get_loss function in Megatron-LM).
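As a rough illustration of that pattern, here is a minimal sketch in plain PyTorch. ToyGate is a hypothetical stand-in that only mirrors the three names mentioned above; it is not FastMoE's BaseGate. Treating has_loss as a property, the clear argument of get_loss, the Switch-style balance term, the collect_balance_loss helper and the balance_weight value are all assumptions made for the example, so check the real BaseGate before copying any of it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyGate(nn.Module):
    """Hypothetical stand-in gate (NOT FastMoE's BaseGate)."""

    def __init__(self, d_model, num_expert):
        super().__init__()
        self.num_expert = num_expert
        self.proj = nn.Linear(d_model, num_expert)
        self.loss = None

    def set_loss(self, loss):
        self.loss = loss

    def get_loss(self, clear=True):
        # Assumption: reading the loss also clears it, so a stale value
        # is never added twice.
        loss = self.loss
        if clear:
            self.loss = None
        return loss

    @property
    def has_loss(self):
        # Assumption: exposed as a property in this sketch.
        return self.loss is not None

    def forward(self, x):
        probs = F.softmax(self.proj(x), dim=-1)   # (tokens, num_expert)
        top1 = probs.argmax(dim=-1)               # chosen expert per token
        # Fraction of tokens routed to each expert, and mean gate probability.
        frac = F.one_hot(top1, self.num_expert).float().mean(dim=0)
        mean_prob = probs.mean(dim=0)
        # Switch-style balance term, chosen here only for illustration.
        self.set_loss(self.num_expert * (frac * mean_prob).sum())
        return top1, probs


def collect_balance_loss(model):
    """Sum the stored losses of every ToyGate inside `model`."""
    total = 0.0
    for m in model.modules():
        if isinstance(m, ToyGate) and m.has_loss:
            total = total + m.get_loss()
    return total


torch.manual_seed(0)
gate = ToyGate(d_model=8, num_expert=4)
x = torch.randn(16, 8)
top1, probs = gate(x)                 # forward pass stores the balance loss

task_loss = torch.zeros(())           # placeholder for the real training loss
balance_weight = 0.01                 # hypothetical coefficient
loss = task_loss + balance_weight * collect_balance_loss(gate)
loss.backward()
```

In a real training loop the collection step would run once per iteration, after the forward pass and before backward, with the isinstance check pointing at the actual gate class used by the model.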

We will add this to our documentation.

laekov, Nov 06 '23 06:11