fails on HEIC files

Open anotherjesse opened this issue 2 years ago • 0 comments
If the training data includes a HEIC file, training fails:

Traceback (most recent call last):
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/cog/server/worker.py", line 209, in _predict
result = self._predictor.predict(**payload)
File "/src/predictor.py", line 289, in predict
main(args)
File "/src/dreambooth.py", line 762, in main
for batch in tqdm(train_dataloader, desc="Caching latents"):
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/tqdm/std.py", line 1195, in __iter__
for obj in iterable:
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
data = self._next_data()
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 721, in _next_data
data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/src/dreambooth.py", line 426, in __getitem__
instance_image = Image.open(instance_path)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/PIL/Image.py", line 3186, in open
raise UnidentifiedImageError(
PIL.UnidentifiedImageError: cannot identify image file '/src/cog_instance_data/8.jpeg'

The file has a .jpeg extension, but identify shows that it is actually HEIC:

$ identify 8.jpeg 
8.jpeg HEIC 3024x4032 3024x4032+0+0 8-bit YCbCr 0.000u 0:00.005
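
A similar check could be run on the training folder before training starts, so that mislabeled files are reported up front instead of surfacing as an UnidentifiedImageError mid-run. The following is a minimal sketch using only the standard library. It assumes HEIC files begin with an ISO base media "ftyp" box whose major brand is one of the values listed; the brand list is an assumption and not exhaustive, and the names looks_like_heic and find_heic_files are hypothetical, not part of this repo.

```python
from pathlib import Path
from typing import List

# Major brands commonly seen in HEIC/HEIF files (assumed list, not exhaustive).
HEIF_BRANDS = {b"heic", b"heix", b"hevc", b"hevx", b"mif1", b"msf1"}


def looks_like_heic(header: bytes) -> bool:
    """Return True if the first 12 bytes look like a HEIF 'ftyp' box.

    Bytes 0-3 are the box size, bytes 4-7 the box type, bytes 8-11 the major brand.
    """
    return (
        len(header) >= 12
        and header[4:8] == b"ftyp"
        and header[8:12] in HEIF_BRANDS
    )


def find_heic_files(data_dir: str) -> List[Path]:
    """List files in data_dir whose content looks like HEIC, whatever the extension."""
    found = []
    for path in sorted(Path(data_dir).iterdir()):
        if path.is_file():
            with path.open("rb") as f:
                if looks_like_heic(f.read(12)):
                    found.append(path)
    return found
```

Calling find_heic_files on the instance data directory would flag a file like 8.jpeg above, since the check looks at the content and ignores the extension.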

Do we add support for HEIC, e.g. via https://github.com/bigcat88/pillow_heif, or do we document that folks should not use HEIC files?
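
If adding support is the route taken, a minimal sketch could look like the following. It assumes pillow_heif is installed in the image and relies on its register_heif_opener function, which registers a HEIF plugin with Pillow so that Image.open can identify HEIC content. The helper name enable_heic_support is hypothetical, and the optional-import fallback is an illustration, not something this repo currently does.

```python
def enable_heic_support() -> bool:
    """Try to register HEIC/HEIF decoding with Pillow.

    Returns True if pillow_heif was available and its opener was registered,
    False if the package is not installed (HEIC files will still fail to open).
    """
    try:
        # Third-party dependency; assumed to be installed alongside Pillow.
        from pillow_heif import register_heif_opener
    except ImportError:
        return False
    register_heif_opener()
    return True
```

This would need to run once before the dataset calls Image.open, for example at import time in the training script. When it returns False, the other option still applies: document that HEIC files are not supported.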

anotherjesse, Dec 30 '22 14:12