latent-diffusion
Reproducibility of FID scores
I have retested the FID score with torch-fidelity on FFHQ and CelebA-HQ using the default DDIM algorithm and the recommended steps (200 / 500), and it gives a much worse FID score (about 9+) than the results reported in the original paper. Is there any difference between the given checkpoint and the checkpoint used in Table 1?
Hi @LuChengTHU , could you share your script to evaluate FID on FFHQ? Thanks!
@Ir1d I just use the official command in the repo of torch-fidelity.
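For reference, the torch-fidelity CLI invocation looks roughly like this (the two directory paths are placeholders for your generated samples and reference images; check `fidelity --help` against your installed version):

```shell
# FID between a folder of generated samples and a folder of reference images
fidelity --gpu 0 --fid --input1 path/to/generated_samples --input2 path/to/reference_images
```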
Hi @LuChengTHU, have you reproduced the results reported in the original paper? I sampled 50k images using the provided checkpoint on LSUN-Churches and calculated FID via torch-fidelity, but the FID was 15.89, much worse than the FID reported in the original paper (4.02).
Same problem here, I don't have a clue. Does multi-GPU generation matter?
To all: I haven't solved it yet... :(
Same here. I got FID 9.36 on FFHQ using the checkpoints and scripts provided in this repo
@ader47 Have you reproduced the FFHQ FID results?
Not yet
same here
FFHQ 256, 50k images, 200 DDIM steps
Hey guys @notou10 @zengxianyu @ader47 @LuChengTHU, can you share which FFHQ 256 data source you used as the reference batch to compute FID?
I am thinking of using this dataset source: https://www.kaggle.com/datasets/denislukovnikov/ffhq256-images-only for FID computation
I successfully reproduced the FFHQ FID from the paper by using the following code to preprocess the reference batch (FFHQ 1024).
```python
import os

import albumentations
import numpy as np
from PIL import Image
from torch.utils.data import Dataset


class ImagePaths(Dataset):
    def __init__(self, paths, size=None, random_crop=False, labels=None):
        self.size = size
        self.random_crop = random_crop
        self.labels = dict() if labels is None else labels
        self.labels["file_path_"] = paths
        self._length = len(paths)

        if self.size is not None and self.size > 0:
            # Resize the smaller side to `size`, then crop a square patch.
            self.rescaler = albumentations.SmallestMaxSize(max_size=self.size)
            if not self.random_crop:
                self.cropper = albumentations.CenterCrop(height=self.size, width=self.size)
            else:
                self.cropper = albumentations.RandomCrop(height=self.size, width=self.size)
            self.preprocessor = albumentations.Compose([self.rescaler, self.cropper])
        else:
            self.preprocessor = lambda **kwargs: kwargs

    def __len__(self):
        return self._length

    def preprocess_image(self, image_path):
        image = Image.open(image_path)
        if not image.mode == "RGB":
            image = image.convert("RGB")
        image = np.array(image).astype(np.uint8)
        image = self.preprocessor(image=image)["image"]
        # Scale to [-1, 1], matching the training-time normalization.
        image = (image / 127.5 - 1.0).astype(np.float32)
        return image

    def __getitem__(self, i):
        example = dict()
        example["image"] = self.preprocess_image(self.labels["file_path_"][i])
        for k in self.labels:
            example[k] = self.labels[k][i]
        return example


def preprocess_and_copy_selected_images(txt_file, source_directory, new_target_directory, image_size):
    with open(txt_file, "r") as file:
        image_names = file.read().splitlines()

    image_paths = [
        os.path.join(source_directory, name)
        for name in image_names
        if os.path.exists(os.path.join(source_directory, name))
    ]
    dataset = ImagePaths(paths=image_paths, size=image_size, random_crop=False)

    os.makedirs(new_target_directory, exist_ok=True)
    for i in range(len(dataset)):
        # Undo the [-1, 1] normalization before saving back to uint8.
        preprocessed_image = dataset[i]["image"]
        image_name = os.path.basename(dataset.labels["file_path_"][i])
        Image.fromarray((preprocessed_image * 127.5 + 127.5).astype(np.uint8)).save(
            os.path.join(new_target_directory, image_name)
        )
    print(f"Preprocessed images have been saved to {new_target_directory}")
```
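For intuition, the reference-batch preprocessing above boils down to resizing the shorter side to the target resolution and then center-cropping a square. Here is a minimal PIL-only sketch of that step (the helper name `preprocess_reference` is mine, and PIL's bilinear resampling only approximates albumentations' OpenCV interpolation, which can shift FID slightly):

```python
from PIL import Image


def preprocess_reference(image, size=256):
    """Resize the shorter side to `size`, then center-crop a size x size patch."""
    w, h = image.size
    scale = size / min(w, h)
    # Scale so the shorter side becomes exactly `size`.
    image = image.resize((round(w * scale), round(h * scale)), Image.BILINEAR)
    w, h = image.size
    left = (w - size) // 2
    top = (h - size) // 2
    return image.crop((left, top, left + size, top + size))
```

For FFHQ 1024 the images are already square, so this reduces to a plain downscale to 256 x 256; the crop only matters for non-square sources.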