Results 11 comments of HYUN Jeongseok

> Maybe the slowfast mode was also used in the 72B model's training stage, rather than only in its inference stage? Based on the 72B model's config.json, it seems...