Results 22 diffusion issues

Frustrating: after training for about 1654/ba, the run was corrupted and failed to save the checkpoint. I tried twice. The error is as follows: > [E ProcessGroupNCCL.cpp:828] [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=39739, OpType=ALLREDUCE,...

I'm trying to run training with `composer run.py --config-path yamls/hydra-yamls --config-name SD-2-base-256.yaml` after changing the configuration to use a custom data loader. I'm getting a generic error: AttributeError("'IterableDatasetDict'...