`training throughput` may not be equal to `time per step`

Open SimLif opened this issue 2 years ago • 1 comments

One of the sections says that training throughput is equivalent to time per step. I have a doubt about this. Suppose there are two batch sizes, 64 and 128. If time per step is 1 in both cases, training throughput is not the same for the two. Training throughput is clearly the metric that better reflects batch size.

SimLif • Jan 26 '23 14:01

They are equivalent by definition:

training throughput = (# examples processed per second)

The right-hand side is equal to (batch size) / (time per step). Rearranging this equation gives:

time per step = (batch size) / (training throughput)

I think the confusion is that "equivalent" does not mean "equal" in this context. Rather, given a particular batch size, knowing the throughput is equivalent information to knowing the step time, in the sense that knowing one allows you to compute the other.
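As a minimal sketch of that conversion, the two relations above can be written as a pair of Python helpers. The function names are hypothetical, and the example assumes time per step is measured in seconds, which the thread does not state. It uses the batch sizes 64 and 128 from the original question.

```python
def throughput_from_step_time(batch_size: int, time_per_step: float) -> float:
    """Training throughput (examples per second) = batch size / time per step."""
    return batch_size / time_per_step


def step_time_from_throughput(batch_size: int, throughput: float) -> float:
    """Time per step (assumed seconds) = batch size / training throughput."""
    return batch_size / throughput


if __name__ == "__main__":
    # Same time per step (assumed 1 second), different batch sizes:
    # the throughputs differ, as the original question points out.
    for batch_size in (64, 128):
        tput = throughput_from_step_time(batch_size, 1.0)
        # With the batch size fixed, the step time is recoverable from
        # the throughput, so neither number carries extra information.
        recovered = step_time_from_throughput(batch_size, tput)
        print(batch_size, tput, recovered)
```

With the batch size fixed, either quantity determines the other. Across different batch sizes, equal step times do not imply equal throughputs.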

jondeuce • Jan 26 '23 19:01

Hey @SimLif, you're right that your language might be more precise here. In general, training throughput is one of the more useful metrics to look at when choosing a batch size.

varungodbole • Mar 21 '24 21:03