PIL.UnidentifiedImageError: cannot identify image file
Describe the bug
The tags for each image are stored in a text file with the same base name as the image file. During training, the script mistakenly tries to open that text file as an image, which raises PIL.UnidentifiedImageError.
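A possible workaround, sketched here only as an illustration: filter the directory listing by file extension so that the tag files never reach Image.open. This assumes the dataset builds its path list from a plain directory listing. The helper name list_image_paths and the extension set IMAGE_EXTENSIONS are hypothetical and not part of train_dreambooth.py.

```python
from pathlib import Path

# Assumed set of extensions to treat as images; adjust for your dataset.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def list_image_paths(data_dir):
    """Return the sorted files in data_dir whose suffix looks like an image.

    Hypothetical helper: sidecar files such as .txt tag files are skipped,
    so they are never passed to PIL.Image.open.
    """
    return sorted(
        p
        for p in Path(data_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

With a filter like this, a file such as 00018-0-00018-2943085617.txt in the class data directory would be skipped instead of being opened as an image.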
Reproduction
accelerate launch --num_cpu_threads_per_process 8 diffusers\examples\dreambooth\train_dreambooth.py ^
  --pretrained_model_name_or_path=models/%DUMP_MODEL% ^
  --instance_data_dir=data/%INSTANCE_DIR% ^
  --class_data_dir=data/%CLASS_DIR% ^
  --output_dir=models/%OUTPUT_DIR% ^
  --with_prior_preservation --prior_loss_weight=1.0 ^
  --instance_prompt="%INSTANCE_NAME%" ^
  --class_prompt="%CLASS_NAME%" ^
  --n_save_sample=1 ^
  --save_sample_prompt=%SAMPLE_PROMPT% ^
  --save_sample_negative_prompt=%SAMPLE_NG_PROMPT% ^
  --save_infer_steps=30 ^
  --save_guidance_scale=7 ^
  --seed=1 ^
  --resolution=512 ^
  --train_batch_size=1 ^
  --gradient_accumulation_steps=1 --gradient_checkpointing ^
  --learning_rate=5e-6 ^
  --mixed_precision="bf16" ^
  --lr_scheduler="constant" ^
  --lr_warmup_steps=0 ^
  --max_train_steps=1000 ^
  --save_interval=100 ^
  --log_interval=10 ^
  --use_8bit_adam ^
  --pad_tokens ^
  --train_text_encoder ^
  --not_cache_latents
Logs
CUDA SETUP: Loading binary D:\Diffusers-DB\venv_db\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll...
Steps: 0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\Diffusers-DB\diffusers\examples\dreambooth\train_dreambooth.py", line 822, in <module>
    main(args)
  File "D:\Diffusers-DB\diffusers\examples\dreambooth\train_dreambooth.py", line 736, in main
    for step, batch in enumerate(train_dataloader):
  File "D:\Diffusers-DB\venv_db\lib\site-packages\accelerate\data_loader.py", line 348, in __iter__
    current_batch = next(dataloader_iter)
  File "D:\Diffusers-DB\venv_db\lib\site-packages\torch\utils\data\dataloader.py", line 681, in __next__
    data = self._next_data()
  File "D:\Diffusers-DB\venv_db\lib\site-packages\torch\utils\data\dataloader.py", line 721, in _next_data
    data = self._dataset_fetcher.fetch(index) # may raise StopIteration
  File "D:\Diffusers-DB\venv_db\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\Diffusers-DB\venv_db\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\Diffusers-DB\diffusers\examples\dreambooth\train_dreambooth.py", line 336, in __getitem__
    class_image = Image.open(class_path)
  File "D:\Diffusers-DB\venv_db\lib\site-packages\PIL\Image.py", line 3147, in open
    raise UnidentifiedImageError(
PIL.UnidentifiedImageError: cannot identify image file 'D:\\Diffusers-DB\\data\\any-1girl_pre\\00018-0-00018-2943085617.txt'
System Info
- diffusers version: 0.7.2
- Platform: Windows-10-10.0.22621-SP0
- Python version: 3.10.7
- PyTorch version (GPU?): 1.12.1+cu113 (True)
- Huggingface_hub version: 0.10.0
- Transformers version: 4.23.1
- Using GPU in script?:
- Using distributed or parallel set-up in script?: