traiNNer
GPU usage at 0% during training
I can see that it is taking up VRAM, but Windows 10 does not show it using the GPU. I am using a 4090 with CUDA V11.8.89, PyTorch 1.12.1, and Python 3.9.

Hello! Can you share your configuration file? And if possible, the training log file as well
config: https://pastebin.com/NvutbgaN
training log: https://pastebin.com/dvdDqgtY
OK, this may actually have just been an issue with Windows.
Running the training script from xinntao's Real-ESRGAN repo did similar things,
and a third-party monitor showed different results from Windows.

Just to make sure I'm not hitting a bottleneck somewhere, though: typically how fast would training be on 512 by 512 data if the GPU is working correctly?
You need to monitor CUDA in task manager, rather than 3D. Do this by clicking where it says 3D (the text) and selecting CUDA from the dropdown. If it's not there, disable Hardware-Accelerated GPU Scheduling in Windows settings.
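
As a cross-check that does not depend on Task Manager, here is a minimal sketch that reads utilization from `nvidia-smi`, which ships with the NVIDIA driver. It assumes `nvidia-smi` is on your PATH and that every queried field comes back as a number (some setups report `[N/A]`, which this sketch does not handle). The helper names `parse_gpu_stats` and `read_gpu_stats` are made up for illustration.

```python
import subprocess

# Fields requested from nvidia-smi; utilization is in percent, memory in MiB.
QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_text):
    """Parse nvidia-smi CSV output (noheader, nounits) into one dict per GPU."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = (field.strip() for field in line.split(","))
        stats.append({
            "util_pct": int(util),
            "mem_used_mib": int(used),
            "mem_total_mib": int(total),
        })
    return stats


def read_gpu_stats():
    """Run nvidia-smi once and return the parsed stats (hypothetical helper)."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_stats(result.stdout)
```

Calling `read_gpu_stats()` a few times while training is running should show a non-zero `util_pct` if the CUDA work is actually reaching the GPU.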

@nub2927 sorry for the late reply, but the configuration and logs look correct! As Kim mentions, PyTorch models running on the GPU only use the CUDA pipeline.
I don't have numbers from a 4090, but training speed looks fast from what I can see in the logs. Any bottleneck you do find would probably mean changing the code to optimize the parts that are handled on the CPU (like the image pipeline), but at least it looks good on the GPU side.
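
If you want to check for a CPU-side bottleneck yourself, one rough approach is to time how long the loop waits for each batch versus how long the training step takes. This is only a sketch, not traiNNer's code: `profile_loop`, `batches`, and `step_fn` are hypothetical names, and you would pass in your own dataloader and training step.

```python
import time


def profile_loop(batches, step_fn, max_steps=50):
    """Split wall time into 'waiting for data' and 'running the step'.

    batches: any iterable of batches (e.g. a dataloader).
    step_fn: callable taking one batch. CUDA kernels run asynchronously, so
             for meaningful numbers step_fn should end with
             torch.cuda.synchronize() when it does GPU work.
    """
    data_wait = 0.0
    compute = 0.0
    steps = 0
    iterator = iter(batches)
    while steps < max_steps:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # time spent here is the input pipeline
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)              # time spent here is the training step
        t2 = time.perf_counter()
        data_wait += t1 - t0
        compute += t2 - t1
        steps += 1
    return {"steps": steps, "data_wait_s": data_wait, "compute_s": compute}
```

If `data_wait_s` is a large share of the total, the GPU is probably being starved by the input pipeline rather than being slow itself.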
I am having a problem just starting.