Unit in (model-only) latency

Open fengyangyang98 opened this issue 2 years ago • 0 comments

In print_latency, mtimes is multiplied by 1000 to convert it to ms, but it is already in ms, so the printed latency is 1000x too large.

use_cuda_events is true by default in the function profile_model_time, so mtimes is measured with CUDA events.

From https://pytorch.org/docs/stable/generated/torch.cuda.Event.html, the description of elapsed_time:

Returns the time elapsed in milliseconds after the event was recorded and before the end_event was recorded.
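
To illustrate the unit issue, here is a minimal sketch of timing with CUDA events and printing the result without an extra conversion. The names time_once_ms and format_latency_ms are hypothetical and are not the repository's profile_model_time or print_latency; the fallback to time.perf_counter when CUDA or torch is unavailable is also an assumption made so that the sketch runs anywhere.

```python
import time

try:
    import torch
except ImportError:  # torch is optional for this sketch
    torch = None


def time_once_ms(fn, use_cuda_events=True):
    """Run fn() once and return its duration in milliseconds (hypothetical helper)."""
    if use_cuda_events and torch is not None and torch.cuda.is_available():
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        fn()
        end.record()
        torch.cuda.synchronize()  # wait until both events have been recorded
        return start.elapsed_time(end)  # already in milliseconds, no scaling
    # Fallback: perf_counter returns seconds, so this branch does need * 1000.
    t0 = time.perf_counter()
    fn()
    return (time.perf_counter() - t0) * 1000.0


def format_latency_ms(latency_ms):
    """Format a value that is already in milliseconds (hypothetical helper)."""
    return f"{latency_ms:.3f} ms"


if __name__ == "__main__":
    elapsed = time_once_ms(lambda: time.sleep(0.01))
    print(format_latency_ms(elapsed))
```

The point of the sketch is that only the perf_counter branch multiplies by 1000; the CUDA event branch returns the value of elapsed_time unchanged.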

Apr 18 '23 07:04 fengyangyang98