dreambooth
fails on HEIC files
If the training data includes a HEIC file, training fails:
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/cog/server/worker.py", line 209, in _predict
    result = self._predictor.predict(**payload)
  File "/src/predictor.py", line 289, in predict
    main(args)
  File "/src/dreambooth.py", line 762, in main
    for batch in tqdm(train_dataloader, desc="Caching latents"):
  File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
    data = self._next_data()
  File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 721, in _next_data
    data = self._dataset_fetcher.fetch(index) # may raise StopIteration
  File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/src/dreambooth.py", line 426, in __getitem__
    instance_image = Image.open(instance_path)
  File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/PIL/Image.py", line 3186, in open
    raise UnidentifiedImageError(
PIL.UnidentifiedImageError: cannot identify image file '/src/cog_instance_data/8.jpeg'
Although the file has a .jpeg extension, ImageMagick's identify shows that it is actually HEIC:
$ identify 8.jpeg
8.jpeg HEIC 3024x4032 3024x4032+0+0 8-bit YCbCr 0.000u 0:00.005
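Since the extension can't be trusted here, one way to catch this before training starts might be to sniff the file header. Below is a rough sketch, not what the repo does today: `looks_like_heif` and `find_heif_files` are hypothetical helper names, and the brand list is a common subset rather than an exhaustive one. It relies on HEIF files starting with an ISO BMFF `ftyp` box whose major brand sits at bytes 8-12.

```python
from pathlib import Path

# Major brands commonly seen in HEIF/HEIC files (assumed subset, not exhaustive).
HEIF_BRANDS = {b"heic", b"heix", b"hevc", b"hevx", b"mif1", b"msf1"}


def looks_like_heif(path) -> bool:
    """Heuristic: True if the file begins with an 'ftyp' box with a HEIF major brand."""
    with open(path, "rb") as f:
        header = f.read(12)
    # Bytes 0-3: box size, 4-7: box type ('ftyp'), 8-11: major brand.
    return (
        len(header) == 12
        and header[4:8] == b"ftyp"
        and header[8:12] in HEIF_BRANDS
    )


def find_heif_files(data_dir):
    """Return files in data_dir whose content looks like HEIF, regardless of extension."""
    return sorted(
        p for p in Path(data_dir).iterdir() if p.is_file() and looks_like_heif(p)
    )
```

Running `find_heif_files` on the instance data directory before building the dataloader would let us fail early with a message naming the offending files, instead of the `UnidentifiedImageError` above.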
Should we add support for HEIC (e.g. via https://github.com/bigcat88/pillow_heif), or should we document that users should not use HEIC?
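For the first option, a minimal sketch of what opting in could look like, assuming `pillow_heif` is added to the image's dependencies. `register_heif_opener` is the plugin's documented entry point for hooking into `PIL.Image.open`; the wrapper `enable_heif_support` is a hypothetical name, and where to call it (probably near the top of dreambooth.py) is a guess.

```python
def enable_heif_support() -> bool:
    """Register the HEIF opener with Pillow if pillow_heif is installed.

    Returns True when HEIC/HEIF decoding was enabled, False otherwise.
    """
    try:
        # Optional dependency: only available if pillow_heif is installed.
        from pillow_heif import register_heif_opener
    except ImportError:
        return False
    # After this call, PIL.Image.open can identify and decode HEIF files.
    register_heif_opener()
    return True


if __name__ == "__main__":
    print("HEIF support enabled:", enable_heif_support())
```

If this returns False we would fall back to the second option and just tell people not to upload HEIC.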