HugeCTR
[Requirement] Profiling operations for HugeCTR
Hi HugeCTR team,
I recently used Nsight to profile a model that uses HugeCTR.
Unlike DLProf, which gives a per-operation breakdown of the model, the Nsight results are very low level, and it is quite difficult to work out the total time spent in each operation.
Is there a way to get a high-level operation profile for a HugeCTR model?
Hi @regnnighe, so far DLProf doesn't support HugeCTR, and there is no dedicated high-level profiling tool for HugeCTR. But I think better profiling support is a good requirement. Could you elaborate on what kind of high-level operation profile you are looking for? Do you think DLProf would be enough for your use case?
@zehuanw Thank you for your reply! For example, when I use HugeCTR for the DLRM model, it would be nice to have a profiler that reports the operation time for the different parts of the model, such as the embeddings, bottom MLP, top MLP, etc.
DLProf works well for a PyTorch model because it shows the operation labels, so I know how much time corresponds to each part of the model.
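To make the request concrete, here is a minimal sketch of the kind of per-part breakdown I have in mind. It is plain Python and does not use any HugeCTR, Nsight, or DLProf API; the `records` list, its labels, and its timings are hypothetical values made up for illustration, standing in for whatever labeled timings a profiler could export.

```python
from collections import defaultdict

# Hypothetical (label, duration_ms) records. The labels and numbers are
# invented for illustration and are not real profiler output.
records = [
    ("embedding", 4.0),
    ("bottom_mlp", 1.5),
    ("top_mlp", 2.5),
    ("embedding", 2.0),
]


def breakdown(records):
    """Sum durations per label; return (label, total_ms, share) rows, largest first."""
    totals = defaultdict(float)
    for label, ms in records:
        totals[label] += ms
    grand = sum(totals.values())
    if grand == 0:
        # Nothing was timed, so there is no breakdown to report.
        return []
    rows = [(label, ms, ms / grand) for label, ms in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for label, ms, share in breakdown(records):
    print(f"{label:<12}{ms:8.2f} ms{share:8.1%}")
```

The point is only the shape of the output: one row per model part with its total time and share, rather than one row per low-level kernel.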
Thank you for the feedback! I have relabeled this as a functional feature requirement. We will track the planning and development in this issue.