gan CelebA dataset download/import issue, a workaround

Open chang48 opened this issue 2 years ago • 0 comments

First of all, I really enjoyed reading "Make Your First GAN with PyTorch." The book clearly explains the idea behind GANs and how they are trained. It is a wonderful book for anyone interested in the subject.

I'm opening this ticket because I had trouble accessing the CelebA dataset when following the book's instructions. The problem seems to come from the imageio package. In particular, I got the following error:

`ValueError: Could not find a backend to open `/Users/cchang/Projects/torch/celeba_data/__MACOSX/img_align_celeba/._052628.jpg`` with iomode `ri`.
Based on the extension, the following plugins might add capable backends:
  pyav:  pip install imageio[pyav]
  opencv:  pip install imageio[opencv]`

I installed the plugins as instructed, then this came up:

`ValueError: Could not find a backend to open `xxx.jpg` with iomode `ri`.`
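One detail in the first traceback may be worth noting: the file it fails on sits under `__MACOSX/` and its name starts with `._`. Entries like that are metadata that macOS adds when it creates a zip archive, not real images, so an image backend cannot decode them. This probably does not explain the second error, but a minimal sketch of skipping such entries when walking the archive could look like the following. The helper names `is_image_entry` and `list_image_entries` are hypothetical and not from the book.

```python
import posixpath
import zipfile


def is_image_entry(name):
    """Return True for archive entries that look like real JPEG images."""
    # Zip entry names always use '/' as the separator.
    if '__MACOSX' in name.split('/'):
        return False  # macOS metadata folder
    base = posixpath.basename(name)
    if base.startswith('._'):
        return False  # AppleDouble metadata file, not an image
    return base.lower().endswith('.jpg')


def list_image_entries(zip_path):
    """List the entries of a zip archive that pass is_image_entry()."""
    with zipfile.ZipFile(zip_path) as zf:
        return [n for n in zf.namelist() if is_image_entry(n)]
```

Only the names returned by `list_image_entries` would then be passed to the image reader, so the `._052628.jpg` entry from the traceback is never opened.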

To proceed anyway, I downloaded the CelebA dataset directly from the source (either the link provided in the book or Kaggle) and created the HDF5 file as follows:

import h5py
import matplotlib.image as mpimg

hdf5_file = 'celeba_aligned_small.h5py'
with h5py.File(hdf5_file, 'w') as hf:
    # CelebA files are named 000001.jpg, 000002.jpg, ...; this range covers 1..19999
    for i in range(1, 20000):
        name = 'img_align_celeba/{0:06d}.jpg'.format(i)
        img = mpimg.imread('./' + name)
        # Store each image as its own gzip-compressed dataset, keyed by its file path
        hf.create_dataset(name,
                          data=img,
                          compression="gzip",
                          compression_opts=9)
        # Report progress every 1000 images
        if i % 1000 == 0:
            print("images done .. ", i)
Note that:

- I used matplotlib's image module (matplotlib.image) to read the CelebA image files. There are many other options available for this.
- I followed CelebA's file naming convention (six-digit, zero-padded names starting at 000001.jpg). As a result, the book's CelebADataset() class needs a small adjustment to look up images under these names.
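To illustrate the kind of adjustment meant here, below is a rough sketch of an index-to-name lookup that matches the file written above. It is not the book's CelebADataset() class: the names `celeba_key` and `CelebAImages` are hypothetical, and the sketch assumes a 0-based index and the `img_align_celeba` group created by the script in this post.

```python
def celeba_key(index):
    """Map a 0-based index to a CelebA file name, e.g. 0 -> '000001.jpg'."""
    return '{0:06d}.jpg'.format(index + 1)


class CelebAImages:
    """Index-based access to images stored under CelebA file names."""

    def __init__(self, group):
        # group: the 'img_align_celeba' group of an open h5py.File
        # (assumption: any mapping keyed by file name works the same way)
        self.group = group

    def __len__(self):
        return len(self.group)

    def __getitem__(self, index):
        if not 0 <= index < len(self.group):
            raise IndexError(index)
        # With h5py this returns a Dataset; convert it with numpy.array(...)
        # before building a tensor from it.
        return self.group[celeba_key(index)]
```

With the file from this post, the group would be obtained as `h5py.File('celeba_aligned_small.h5py', 'r')['img_align_celeba']`, and a dataset class could reuse `celeba_key` inside its own `__getitem__`.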

I hope this helps anyone else who runs into a similar issue.

Feb 07 '23 00:02 chang48
