
Reproducibility of FID scores

Open LuChengTHU opened this issue 1 year ago • 10 comments

I retested the FID score with torch-fidelity on FFHQ and CelebA-HQ using the default DDIM algorithm and the recommended steps (200 / 500), and it gives a much worse FID score (around 9+) than the results reported in the original paper. Is there any difference between the given checkpoint and the checkpoint used in Table 1?

LuChengTHU avatar Aug 28 '22 07:08 LuChengTHU

Hi @LuChengTHU , could you share your script to evaluate FID on FFHQ? Thanks!

Ir1d avatar Aug 28 '22 13:08 Ir1d

@Ir1d I just use the official command in the repo of torch-fidelity.
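For reference, the torch-fidelity CLI comparison between two image folders looks like this; the folder paths below are placeholders, and torch-fidelity must be installed (`pip install torch-fidelity`):

```shell
# Compute FID between 50k generated samples and the reference batch.
# Both arguments are directories of images; paths here are illustrative.
fidelity --gpu 0 --fid \
  --input1 samples_50k/ \
  --input2 ffhq256_reference/
```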

LuChengTHU avatar Aug 28 '22 14:08 LuChengTHU

Hi @LuChengTHU, have you reproduced the results reported in the original paper? I sampled 50k images using the provided checkpoint on LSUN-Churches and calculated FID via torch-fidelity, but the FID was 15.89, much worse than the FID reported in the original paper (4.02).

ader47 avatar Mar 23 '23 07:03 ader47

Same problem here; I don't have a clue. Does multi-GPU generation matter?

ThisisBillhe avatar Apr 25 '23 12:04 ThisisBillhe

To all: I haven't solved it yet... :(

LuChengTHU avatar May 07 '23 16:05 LuChengTHU

Same here. I got FID 9.36 on FFHQ using the checkpoints and scripts provided in this repo.

zengxianyu avatar May 15 '23 18:05 zengxianyu

@ader47 Have you reproduced the FFHQ FID results?

Not yet

ader47 avatar May 20 '23 05:05 ader47

Same here.

FFHQ 256, 50k images, 200 DDIM steps


notou10 avatar Jun 12 '23 06:06 notou10

Hey guys @notou10 @zengxianyu @ader47 @LuChengTHU, can you share which FFHQ 256 data source you used as the reference batch to compute FID?

I am thinking of using this dataset source: https://www.kaggle.com/datasets/denislukovnikov/ffhq256-images-only for FID computation

forever208 avatar Aug 04 '23 21:08 forever208

I successfully reproduced the FFHQ FID from the paper by using the following code to preprocess the reference batch (ffhq1024).

```python
import os

import albumentations
import numpy as np
from PIL import Image
from torch.utils.data import Dataset


class ImagePaths(Dataset):
    def __init__(self, paths, size=None, random_crop=False, labels=None):
        self.size = size
        self.random_crop = random_crop
        self.labels = dict() if labels is None else labels
        self.labels["file_path_"] = paths
        self._length = len(paths)

        if self.size is not None and self.size > 0:
            self.rescaler = albumentations.SmallestMaxSize(max_size=self.size)
            if not self.random_crop:
                self.cropper = albumentations.CenterCrop(height=self.size, width=self.size)
            else:
                self.cropper = albumentations.RandomCrop(height=self.size, width=self.size)
            self.preprocessor = albumentations.Compose([self.rescaler, self.cropper])
        else:
            self.preprocessor = lambda **kwargs: kwargs

    def __len__(self):
        return self._length

    def preprocess_image(self, image_path):
        image = Image.open(image_path)
        if not image.mode == "RGB":
            image = image.convert("RGB")
        image = np.array(image).astype(np.uint8)
        image = self.preprocessor(image=image)["image"]
        # Normalize to [-1, 1], matching the latent-diffusion data pipeline
        image = (image / 127.5 - 1.0).astype(np.float32)
        return image

    def __getitem__(self, i):
        example = dict()
        example["image"] = self.preprocess_image(self.labels["file_path_"][i])
        for k in self.labels:
            example[k] = self.labels[k][i]
        return example


def preprocess_and_copy_selected_images(txt_file, source_directory, new_target_directory, image_size):
    # Read the list of image filenames to keep (one per line)
    with open(txt_file, "r") as file:
        image_names = file.read().splitlines()

    image_paths = [
        os.path.join(source_directory, name)
        for name in image_names
        if os.path.exists(os.path.join(source_directory, name))
    ]
    dataset = ImagePaths(paths=image_paths, size=image_size, random_crop=False)

    os.makedirs(new_target_directory, exist_ok=True)
    for i in range(len(dataset)):
        preprocessed_image = dataset[i]["image"]
        image_name = os.path.basename(dataset.labels["file_path_"][i])
        # Map [-1, 1] back to [0, 255] and save
        Image.fromarray((preprocessed_image * 127.5 + 127.5).astype(np.uint8)).save(
            os.path.join(new_target_directory, image_name)
        )

    print(f"Preprocessed images have been saved to {new_target_directory}")
```
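The core of that preprocessing (resize the shorter side to the target, then center-crop to a square) can also be sketched without albumentations, using PIL only. This is an illustrative equivalent, not the exact pipeline: the function name is mine, and the interpolation choice is an assumption.

```python
import numpy as np
from PIL import Image


def resize_and_center_crop(image, size):
    """Resize so the shorter side equals `size`, then center-crop to size x size.

    Approximates albumentations.SmallestMaxSize + CenterCrop; interpolation
    mode (bicubic here) is an assumption.
    """
    w, h = image.size
    scale = size / min(w, h)
    image = image.resize((round(w * scale), round(h * scale)), Image.BICUBIC)
    w, h = image.size
    left = (w - size) // 2
    top = (h - size) // 2
    return image.crop((left, top, left + size, top + size))


# Example: a 1024x768 dummy image cropped down to 256x256
img = Image.fromarray(np.zeros((768, 1024, 3), dtype=np.uint8))
out = resize_and_center_crop(img, 256)
print(out.size)  # (256, 256)
```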

duchengbin8 avatar Jan 26 '24 03:01 duchengbin8