tutorials
Cannot show the "GPU Summary" panel in TensorBoard with the torch_tb_profiler plugin
I followed the "PyTorch Profiler with TensorBoard" tutorial to view training details on an NVIDIA GPU with CUDA.
I ran the same code as in the tutorial and got the final profiling logs, but I cannot see the "GPU Summary" panel on the TensorBoard page. The page I see looks like this:
TensorBoard log:
root@localhost:/workspace# tensorboard --logdir=./log --host 0.0.0.0
TensorFlow installation not found - running with reduced feature set.
NOTE: Using experimental fast data loading logic. To disable, pass
"--load_fast=false" and report issues on GitHub. More details:
https://github.com/tensorflow/tensorboard/issues/4784
I1110 09:26:32.496620 140708883191552 plugin.py:429] Monitor runs begin
I1110 09:26:32.497677 140708883191552 plugin.py:444] Find run directory /workspace/log/resnet18
I1110 09:26:32.498377 140708866406144 plugin.py:493] Load run resnet18
I1110 09:26:32.516169 140708866406144 loader.py:57] started all processing
TensorBoard 2.10.0 at http://0.0.0.0:6006/ (Press CTRL+C to quit)
W1110 09:26:36.232970 140708363106048 security_validator.py:46] In 3.0, this warning will become an error:
Requires default-src for Content-Security-Policy
I1110 09:26:36.509156 140708866406144 plugin.py:497] Run resnet18 loaded
I1110 09:26:36.509511 140708874798848 plugin.py:467] Add run resnet18
cc @aaronenyeshi @chaekit @sekyondaMeta @svekars @carljparker @NicolasHug @kit1980 @subramen @robieta
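One way to narrow down a missing panel is to check whether the trace files themselves contain GPU-side events. The following is a minimal sketch, assuming the profiler wrote Chrome-trace-format JSON files (optionally gzipped) into the run directory `./log/resnet18` shown in the log above, and that each event carries a `cat` field. The helper names `count_event_categories` and `summarize_run` are hypothetical, and the exact category strings vary between PyTorch versions, so the script only prints whatever it finds.

```python
import gzip
import json
from collections import Counter
from pathlib import Path


def count_event_categories(trace_path):
    """Count events per "cat" value in one Chrome-trace-format JSON file."""
    path = Path(trace_path)
    # Assumption: gzipped traces end in ".gz"; everything else is plain JSON.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        trace = json.load(handle)
    # The trace is either an object with a "traceEvents" list or a bare list.
    events = trace.get("traceEvents", []) if isinstance(trace, dict) else trace
    return Counter(
        event.get("cat", "<none>") for event in events if isinstance(event, dict)
    )


def summarize_run(run_dir):
    """Map each trace file name in run_dir to its per-category event counts."""
    summary = {}
    for path in sorted(Path(run_dir).glob("*.json*")):
        summary[path.name] = count_event_categories(path)
    return summary


if __name__ == "__main__":
    # Run directory taken from the TensorBoard log above.
    for name, counts in summarize_run("./log/resnet18").items():
        print(name)
        for category, n in counts.most_common():
            print(f"  {category}: {n}")
```

If no GPU-related categories (for example kernel or memcpy events) show up, the trace likely contains no GPU activity at all; if they do show up, the problem is more likely on the plugin side.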
Hi,
I partly solved this issue using this patch: https://github.com/pytorch/kineto/pull/674/commits/97b52f1ff3ab27b52340f73415cda660fc291b83
But a lot of information is still missing! For instance, the DataLoader part is not properly reported and only one step can be visualized...
This is an issue for me with torch-1.12.1/tensorboard-2.10.0 and torch-1.13/tensorboard-2.11.0. Everything works fine with torch-1.11/tensorboard-2.8.0.
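Since the behaviour seems to depend on the version combination, it may help to report the exact installed versions. A minimal sketch using only the standard library follows; the helper name `installed_versions` is hypothetical, and `torch-tb-profiler` is assumed to be the distribution name of the plugin as installed via pip.

```python
from importlib import metadata

# Distributions relevant to this issue (names as published on PyPI).
PACKAGES = ("torch", "tensorboard", "torch-tb-profiler")


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```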
The issue remains with torch-2.0.0 / tensorboard 2.12.0. Any news?
Same issue, any idea?
The "GPU Summary" is visible when running on Google Colab with torch==2.0.1+cu118 and tensorboard==2.12.2.
Thanks! Can you also see several steps (in the "Step Time Breakdown" graph), and is DataLoader usage non-zero? (Referring to this open issue: https://github.com/pytorch/kineto/issues/610)
That is still 0
Was this issue ever resolved? I am struggling with the same problem.