InternEvo
add use_fp32_logits flag
To use bf16 logits for the loss, set:
loss = dict(
label_smoothing=0, op_type='flash_vocab_parallel'
)
use_fp32_logits = False
By default, use_fp32_logits is True, so existing configs behave as before and there is no BC break.
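
The default behaviour described above can be illustrated with a minimal sketch. The helper name `resolve_logits_dtype` and the plain-dict config are hypothetical, used only for illustration; this is not InternEvo's actual implementation. It assumes the flag sits at the top level of the config, as in the snippet above, and that a missing flag is treated as True.

```python
# Hypothetical helper for illustration only; not InternEvo's actual code.
def resolve_logits_dtype(config: dict) -> str:
    """Return the name of the dtype used for the logits passed to the loss."""
    # A missing flag falls back to True, so configs written before this
    # change keep computing the loss on fp32 logits.
    use_fp32_logits = config.get("use_fp32_logits", True)
    return "fp32" if use_fp32_logits else "bf16"


# A config that predates the flag, and one that opts in to bf16 logits.
old_config = {"loss": dict(label_smoothing=0, op_type="flash_vocab_parallel")}
new_config = {**old_config, "use_fp32_logits": False}

print(resolve_logits_dtype(old_config))  # fp32
print(resolve_logits_dtype(new_config))  # bf16
```

Because the fallback value is True, only configs that explicitly set `use_fp32_logits = False` switch to bf16 logits.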