[Feature request] VRAM calculation + precaution warning message
Calculate the amount of VRAM (GPU memory) that will be used during the first stage of trainer.fit, based on the batch size, dataset, and other variables. Right now it can take hours to find the config that best fits the current hardware. For example, I left a run training overnight and, after 35,000 steps, it ran out of GPU memory without any warning and without saving the model before crashing, wasting over $40 of server time.
Solution
Give a warning when the current config is likely to overflow the available VRAM and throw an error; this would prevent wasting money on servers that sit idle overnight after a crash. Alternatively, preallocate all the VRAM the training run will need up front, so an oversized config fails immediately.
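As a rough illustration of the requested check (this is not the Trainer API; model, optimizer, loss_fn, and sample_batch are placeholders), a minimal PyTorch sketch could dry-run a single training step on a representative batch and warn when the measured peak usage leaves too little headroom:

```python
import warnings
import torch

def warn_if_vram_tight(model, optimizer, loss_fn, sample_batch, headroom=0.10):
    """Dry-run one training step and warn if peak VRAM usage leaves less
    than `headroom` (as a fraction of total memory) free. Heuristic only:
    usage can still grow later, e.g. on longer audio samples.
    Note: this runs a real optimizer step, so call it before training starts.
    """
    device = next(model.parameters()).device
    torch.cuda.reset_peak_memory_stats(device)

    inputs, targets = sample_batch
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()    # activations + gradients dominate the peak
    optimizer.step()   # optimizer state (e.g. Adam moments) is allocated here
    optimizer.zero_grad()

    peak = torch.cuda.max_memory_reserved(device)
    total = torch.cuda.get_device_properties(device).total_memory
    if peak > (1.0 - headroom) * total:
        warnings.warn(
            f"Peak VRAM {peak / 2**30:.2f} GiB of {total / 2**30:.2f} GiB "
            f"({peak / total:.0%} used): this config may OOM later in training."
        )
```

Measuring a real step like this is likely more reliable than any static batch-size-vs-VRAM table, since peak usage also depends on model size, precision, and the longest samples in the dataset.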
TABLE
If anyone has a good table of batch size vs. VRAM usage, please post it here.
There is ongoing work on this here, but it is not easy at all.
There is a branch I'm working on in our trainer repo that will auto-calculate the max batch size for you, so you don't have to worry about it: https://github.com/coqui-ai/Trainer/tree/largest_batch_size_finder. Currently it only finds the largest batch size possible for training, but giving a warning for people trying to use different batch sizes is a good idea.
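For anyone curious how such a finder typically works (a generic sketch under my own assumptions, not the code in that branch): grow the batch size exponentially until a step hits a CUDA out-of-memory error, then binary-search between the last size that fit and the first that didn't. try_step below is a hypothetical callable that runs one full forward/backward pass at a given batch size:

```python
import torch

def find_max_batch_size(try_step, start=1, limit=2048):
    """Return the largest batch size for which try_step(batch_size) completes
    without a CUDA out-of-memory error. try_step is a hypothetical callable
    running one forward/backward pass at the given batch size.
    """
    def fits(bs):
        try:
            try_step(bs)
            return True
        except RuntimeError as e:
            if "out of memory" not in str(e).lower():
                raise  # unrelated error: don't swallow it
            torch.cuda.empty_cache()  # release memory from the failed attempt
            return False

    # Phase 1: double the batch size until it no longer fits (or we hit limit).
    lo, bs = 0, start
    while bs <= limit and fits(bs):
        lo, bs = bs, bs * 2
    if lo == 0:
        raise RuntimeError(f"Even batch size {start} does not fit in VRAM.")
    hi = min(bs, limit + 1)  # smallest size known (or assumed) not to fit

    # Phase 2: binary search for the largest fitting size in (lo, hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        lo, hi = (mid, hi) if fits(mid) else (lo, mid)
    return lo
```

In practice you would back the result off by a safety margin, since memory fragmentation and variable-length batches can push a borderline size over the edge mid-training.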