How to use the ddim_reverse_sample method?

Open tikitong opened this issue 2 years ago • 0 comments

Hi, I have a question about using the DDIM inversion scheduler with Kandinsky 2_1.

Can ddim_reverse_sample be called directly, as in the method below?

...
    @torch.no_grad()
    def test(
        self,
        num_steps=100,
        batch_size=1,
        guidance_scale=4,
        h=768,
        w=768,
        sampler='ddim_sampler',
        prior_cf_scale=4,
        prior_steps="5",
        negative_prior_prompt="",
        negative_decoder_prompt=""):
        
        config = deepcopy(self.config)
        diffusion = create_gaussian_diffusion(**config["diffusion_config"])

        init_image = Image.open('images/test.png')
        image = prepare_image(pil_image=init_image, h=h, w=w).to(self.device)
        if self.use_fp16:
            image = image.half()
        image = self.image_encoder.encode(image) * self.scale
        print(image.shape, image.dtype)
        
        t = torch.tensor([diffusion.timestep_map[999]]).to(self.device)
        print(t, t.shape)
        
        image_noise = diffusion.ddim_reverse_sample(model=self.model, x=image, t=t)
        ...

making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
torch.Size([1, 4, 96, 96]) torch.float16
tensor([999], device='cuda:0') torch.Size([1])
Traceback (most recent call last):
  File "/home/kandinsky2/code/bdAi/test.py", line 282, in <module>
    model.test()
  File "/home/kandinsky2/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/kandinsky2/code/bdAi/test.py", line 260, in test
    image_noise = diffusion.ddim_reverse_sample(model=self.model, x=image, t=t)
  File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/gaussian_diffusion.py", line 535, in ddim_reverse_sample
    out = self.p_mean_variance(
  File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/respace.py", line 102, in p_mean_variance
    return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
  File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/gaussian_diffusion.py", line 251, in p_mean_variance
    model_output = model(x, s_t, **model_kwargs)
  File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/respace.py", line 133, in __call__
    return self.model(x, new_ts, **kwargs)
  File "/home/kandinsky2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/text2im_model2_1.py", line 88, in forward
    text_outputs = self.get_text_emb(
  File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/text2im_model2_1.py", line 61, in get_text_emb
    clip_seq = self.clip_to_seq(image_emb).reshape(
  File "/home/kandinsky2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/kandinsky2/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
TypeError: linear(): argument 'input' (position 1) must be Tensor, not NoneType

For example, I want to get the inverted latent for the image embedding at the output of the encoder, in a similar way to what the generate_img2img method does.

I am not sure I understand this correctly. Can the model argument of diffusion.ddim_reverse_sample be self.model directly?

And for the argument t, if we want the same number of steps, should it be 100?

After that, I want to pass the inverted latent to the generate_img method as noise.
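
For reference, DDIM inversion is usually run as a loop of single deterministic steps from t = 0 up to the last timestep, with the same conditioning passed to the noise predictor at every step. The traceback above shows clip_to_seq receiving None as its input, which would be consistent with no model_kwargs (and so no image_emb) reaching the model. Below is a minimal scalar sketch of the loop, not Kandinsky's API: toy_eps_model, its cond argument, the helper names, and the alphas_cumprod values are all hypothetical, and the "alpha_bar is 0 after the last step" convention is an assumption.

```python
import math


def ddim_reverse_step(eps_model, x, t, alphas_cumprod, model_kwargs=None):
    """One deterministic DDIM inversion step: maps x_t to x_{t+1}."""
    model_kwargs = model_kwargs or {}
    a_t = alphas_cumprod[t]
    # Assumed convention: alpha_bar is 0 after the last timestep.
    a_next = alphas_cumprod[t + 1] if t + 1 < len(alphas_cumprod) else 0.0
    # The noise predictor receives its conditioning through model_kwargs.
    eps = eps_model(x, t, **model_kwargs)
    # Estimate x_0 from x_t and the predicted noise.
    pred_x0 = (x - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)
    # Re-noise the estimate to the next (noisier) timestep, with no randomness.
    return math.sqrt(a_next) * pred_x0 + math.sqrt(1.0 - a_next) * eps


def ddim_invert(eps_model, x0, alphas_cumprod, model_kwargs=None):
    """Run the single-step inversion for t = 0, 1, ..., T-1."""
    x = x0
    for t in range(len(alphas_cumprod)):
        x = ddim_reverse_step(eps_model, x, t, alphas_cumprod, model_kwargs)
    return x


def toy_eps_model(x, t, cond=None):
    # Hypothetical stand-in for a conditional noise predictor.
    if cond is None:
        raise TypeError("cond is required")
    return 0.1 * x + cond


# Hypothetical 3-step schedule (cumulative alphas, decreasing with t).
alphas_cumprod = [0.9, 0.5, 0.1]
x_T = ddim_invert(toy_eps_model, 1.0, alphas_cumprod, model_kwargs={"cond": 0.0})
print(x_T)
```

In this sketch the timestep is the loop variable rather than a fixed value such as 999 or the number of steps, and omitting model_kwargs makes the toy model fail in much the same way as the traceback above.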

tikitong, May 17 '23 16:05