tungdq212
When training on multiple GPUs, one GPU sometimes runs at 100% utilization while the others sit idle at 0%. Here is my training command ``` #!/bin/bash CUDA_VISIBLE_DEVICES='0,1,2' \ torchrun...
Thank you for the excellent work. > Detection models now can be exported to TRT engine with batch size > 1 - **inference code doesn't support it yet**, though now they...
Hi, I built a basic Streamlit app like yours, with a GENERATE button that starts image generation. But when the button is clicked multiple times, a number of processes will...
## Bug when training locally with LocalDataset Here is my config (with some personal paths removed), run with MosaicML's diffusion: ``` algorithms: low_precision_groupnorm: attribute: unet precision: amp_fp16 low_precision_layernorm: attribute: unet precision:...
When training on my local machine (3090, 24 GB) with batch size 12, gradient values become NaN after a few steps. But I don't see this when training on a Google Cloud A100...