Dreambooth-Stable-Diffusion
test_dataloader
After completing 2 epochs, I am getting this error:
pytorch_lightning.utilities.exceptions.MisconfigurationException: No test_dataloader() method defined to run Trainer.test.
Here is some more context:
Epoch 0, global step 499: val/loss_simple_ema was not in top 1
Epoch 0: 100%|█| 505/505 [09:33<00:00, 1.14s/it, loss=0.276, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.71e-5]
Average Epoch time: 573.97 seconds
Average Peak memory 35456.11MiB
Epoch 1:   0%| | 0/505 [00:00<?, ?it/s, loss=0.276, v_num=0, train/loss_simple_step=0.0151, train/loss_vlb_step=6.71e-5]
Data shape for DDIM sampling is (1, 4, 64, 64), eta 1.0
Running DDIM Sampling with 200 timesteps
DDIM Sampler: 100%|██████████| 200/200 [00:22<00:00, 8.96it/s]
Data shape for DDIM sampling is (1, 4, 64, 64), eta 1.0
Running DDIM Sampling with 200 timesteps
DDIM Sampler: 100%|██████████| 200/200 [00:29<00:00, 6.78it/s]
Epoch 1:   0%| | 1/505 [00:59<8:19:19, 59.44s/it, loss=0.275, v_num=0, train/loss_simple_step=0.0144, train/loss_vlb_step=6.2e-5]
[W accumulate_grad.h:185] Warning: grad and param do not obey the gradient layout contract. This is not an error, but may impair performance.
grad.sizes() = [320, 320, 1, 1], strides() = [320, 1, 1, 1]
param.sizes() = [320, 320, 1, 1], strides() = [320, 1, 320, 320] (function operator())
Epoch 1:  59%|▌| 300/505 [06:13<04:15, 1.24s/it, loss=0.245, v_num=0, train/loss_simple_step=0.0778, train/loss_vlb_step=0.000256]
Average Epoch time: 373.33 seconds
Average Peak memory 35567.64MiB
Epoch 1:  60%|▌| 301/505 [06:13<04:13, 1.24s/it, loss=0.245, v_num=0, train/loss_simple_step=0.0778, train/loss_vlb_step=0.000256]
Saving latest checkpoint...
Traceback (most recent call last):
  File "main.py", line 835, in <module>
    trainer.test(model, data)
  File "/usr/local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 911, in test
    return self._call_and_handle_interrupt(self._test_impl, model, dataloaders, ckpt_path, verbose, datamodule)
  File "/usr/local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 954, in _test_impl
    results = self._run(model, ckpt_path=self.tested_ckpt_path)
  File "/usr/local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1128, in _run
    verify_loop_configurations(self)
  File "/usr/local/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 42, in verify_loop_configurations
    __verify_eval_loop_configuration(trainer, model, "test")
  File "/usr/local/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py", line 186, in __verify_eval_loop_configuration
    raise MisconfigurationException(f"No `{loader_name}()` method defined to run `Trainer.{trainer_method}`.")
pytorch_lightning.utilities.exceptions.MisconfigurationException: No `test_dataloader()` method defined to run `Trainer.test`.
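For background: Lightning's configuration validator raises this whenever Trainer.test is called and neither the LightningModule nor the DataModule overrides test_dataloader(). A minimal, hypothetical stub that would satisfy the check (not something this repo provides) looks like:

```python
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

class StubTestDataModule(pl.LightningDataModule):
    # Hypothetical stub: merely overriding test_dataloader() is what
    # the configuration validator looks for before Trainer.test runs.
    def test_dataloader(self):
        dummy = TensorDataset(torch.zeros(1, 3, 64, 64))
        return DataLoader(dummy, batch_size=1)
```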
I haven't encountered this, as I did not train for that long. We do not have a test set, and there is no config for a test dataset either. The same applies to textual inversion, so maybe just remove everything that calls a test dataset (i.e., just remove the trainer.test call)? A sketch of one way to guard it is below.
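One hedged way to do that, assuming the latent-diffusion-style end-of-training block in main.py and a DataModuleFromConfig that only registers a test split when the config has a "test" section (both assumptions about this repo's layout):

```python
# Sketch of the end of main.py; `opt`, `trainer`, `model`, and `data`
# are the names used in latent-diffusion-style scripts (assumption).
run_test = not opt.no_test and not trainer.interrupted
# Skip Trainer.test entirely unless a "test" split was configured,
# which avoids the MisconfigurationException above.
if run_test and "test" in getattr(data, "datasets", {}):
    trainer.test(model, data)
```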
It has something to do with the way the LDM scripts are written. I removed the trainer.test call from main.py and added --no_test, but that didn't do it; I still get the same crash.
Running it with --no-test true worked, so that should probably be the default.
For me, --no-test true worked too. It adds to the confusion that the tool doesn't complain about unrecognized parameters, such as a misspelled no_test, or when the parameter value true is missing.
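That silence is a classic argparse pitfall: latent-diffusion-style scripts usually parse with parse_known_args(), which quietly collects anything unrecognized instead of erroring, and a str2bool flag declared with nargs="?" but without const=True stores None (falsy) when the value is omitted, so the test phase would still run. A small self-contained sketch of both behaviors (the exact flag definition in this repo is an assumption):

```python
import argparse

def str2bool(v):
    # Helper in the style of latent-diffusion's main.py (sketch).
    if isinstance(v, bool):
        return v
    if v.lower() in ("yes", "true", "t", "y", "1"):
        return True
    if v.lower() in ("no", "false", "f", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("Boolean value expected.")

parser = argparse.ArgumentParser()
# Without const=True, a bare `--no-test` stores None (falsy);
# only `--no-test true` actually stores True.
parser.add_argument("--no-test", type=str2bool, nargs="?", default=False)

# parse_known_args() never complains about unknown flags -- a
# misspelled `--no_test` just lands in `unknown` and is ignored.
opt, unknown = parser.parse_known_args(["--no_test", "true"])
print(opt.no_test, unknown)   # False ['--no_test', 'true']

opt, unknown = parser.parse_known_args(["--no-test"])
print(opt.no_test)            # None (value omitted, no const=)

opt, unknown = parser.parse_known_args(["--no-test", "true"])
print(opt.no_test)            # True
```

Switching to parser.parse_args(), or declaring the flag with const=True (or default=True, as suggested above), would remove all three sources of confusion.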