
Reconstruction results

Open Marcelo5444 opened this issue 1 year ago • 4 comments

Hi, first of all, thanks for your work.

Working with ViT-small, I see that the reconstruction results are far from VQGAN's. Did you stop training once the model reached convergence? Do you think there is more room to improve the model's performance?

[Attached images: results with vit-small; cat; input image 212861459-e4113b34-622d-4602-afe4-f20e2d79425c]

Marcelo5444 avatar Aug 10 '23 17:08 Marcelo5444

Can you show me your code for reconstruction? I have the same problem: the reconstruction results of ViT-VQGAN on ImageNet are very poor.

ghost avatar Aug 11 '23 12:08 ghost

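import torch
import torchvision.transforms as T
import torchvision.transforms.functional as TF
from omegaconf import OmegaConf
from PIL import Image

# Assumed import path for the repo helper; adjust to your checkout
from enhancing.utils.general import initialize_from_config

config = OmegaConf.load('configs/imagenet_vitvq_small.yaml')
model = initialize_from_config(config.model)
model.init_from_ckpt('/home/marcelo/Downloads/imagenet_vitvq_small.ckpt')

def preprocess(img):
    s = min(img.size)
    if s < 256:
        raise ValueError(f'min dim for image {s} < 256')
    # Resize so the shorter side is 1024 px, then center-crop a 256x256 patch
    r = 1024 / s
    s = (round(r * img.size[1]), round(r * img.size[0]))  # (height, width)
    img = TF.resize(img, s, interpolation=T.InterpolationMode.LANCZOS)
    img = TF.center_crop(img, output_size=2 * [256])
    return torch.unsqueeze(T.ToTensor()(img), 0)  # add batch dim, values in [0, 1]

original = Image.open('/home/marcelo/Downloads/212861459-e4113b34-622d-4602-afe4-f20e2d79425c.png')
image = preprocess(original)
image = image[:, :3, :, :]  # drop the alpha channel if the PNG has one

quant, _ = model.encode(image)
dec = model.decode(quant)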
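To put a number on how far a single reconstruction is from its input, here is a minimal PSNR sketch. It is only a cheap per-image sanity check, not rFID, and it is not part of the repo: the `psnr` helper and the toy values are hypothetical. It assumes both images are flattened to equal-length sequences with values in [0, 1]; the output range of `model.decode` is not confirmed here, so rescale or clamp first if it differs.

```python
import math


def psnr(original, reconstruction, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length flat sequences.

    Hypothetical helper: assumes both inputs share the value range [0, max_val].
    """
    if len(original) != len(reconstruction):
        raise ValueError("inputs must have the same number of values")
    if len(original) == 0:
        raise ValueError("inputs must not be empty")
    # Mean squared error over all pixel values
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstruction)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_val ** 2 / mse)


# Toy values standing in for flattened image tensors
flat_in = [0.0, 0.5, 1.0, 0.25]
flat_out = [0.1, 0.5, 0.9, 0.25]
print(f"PSNR: {psnr(flat_in, flat_out):.2f} dB")
```

With the tensors from the script, something like psnr(image.flatten().tolist(), dec.flatten().tolist()) should work once both are in the same range. Higher is better, and comparing the value against a VQGAN reconstruction of the same crop gives a rough side-by-side.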

Marcelo5444 avatar Aug 11 '23 19:08 Marcelo5444

Actually, I think the cause is a bad model checkpoint. Your script is right. I measured the rFID, and it is far from VQGAN's. I also trained the model on ImageNet myself, but it still performs badly.

The same as the one in the colab notebook


ghost avatar Aug 12 '23 13:08 ghost

So, after your own training, did you obtain better model weights that improve the reconstruction?

Marcelo5444 avatar Aug 16 '23 12:08 Marcelo5444