Volodymyr Arbatov

Results 2 comments of Volodymyr Arbatov

Assuming only one data type can run at a time, MFU can be defined as: `MFU = 100 * sum_i(model_flops[i] / peak_flops[i]) / time`, where `i` iterates over data types and...
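A minimal Python sketch of the formula above may make it concrete. Since the comment is cut off, the units are an assumption here: `model_flops[i]` is taken as the FLOPs executed in data type `i`, `peak_flops[i]` as the peak throughput for that type in FLOP/s, and `time` as the measured wall-clock seconds. The function name `mixed_precision_mfu` and the numbers in the example are hypothetical and do not come from any library or real hardware.

```python
from typing import Mapping


def mixed_precision_mfu(
    model_flops: Mapping[str, float],
    peak_flops: Mapping[str, float],
    elapsed_seconds: float,
) -> float:
    """Return a single MFU percentage across all data types.

    Assumed units (not stated in the original comment):
    model_flops[dtype] is FLOPs executed in that dtype,
    peak_flops[dtype] is peak throughput in FLOP/s.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")

    # Ideal time: each dtype's work at its own peak rate, summed because
    # only one dtype is assumed to run at a time.
    ideal_seconds = sum(
        flops / peak_flops[dtype] for dtype, flops in model_flops.items()
    )
    return 100.0 * ideal_seconds / elapsed_seconds


# Hypothetical numbers, for illustration only.
example_mfu = mixed_precision_mfu(
    model_flops={"bf16": 4e14, "fp8": 4e14},
    peak_flops={"bf16": 1e15, "fp8": 2e15},
    elapsed_seconds=1.0,
)
print(f"MFU: {example_mfu:.1f}%")
```

With these made-up inputs the ideal time is 0.4 s for bf16 plus 0.2 s for fp8, so one measured second corresponds to roughly 60% utilization. Only the total elapsed time is needed; no per-dtype latency is measured.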

@tianyu-l, it doesn't add percentages together; it computes a single utilization percentage against the ideal 100% utilization across all units, without measuring the latency of individual data types. Utilization [50%...