Kandinsky-2
How to use the ddim_reverse_sample method?
Hi, I have a question about using the DDIM inversion scheduler with Kandinsky 2.1.
Can it be called directly like this?
...
@torch.no_grad()
def test(
    self,
    num_steps=100,
    batch_size=1,
    guidance_scale=4,
    h=768,
    w=768,
    sampler='ddim_sampler',
    prior_cf_scale=4,
    prior_steps="5",
    negative_prior_prompt="",
    negative_decoder_prompt="",
):
    config = deepcopy(self.config)
    diffusion = create_gaussian_diffusion(**config["diffusion_config"])
    # Encode the input image into the latent space of the image encoder.
    init_image = Image.open('images/test.png')
    image = prepare_image(pil_image=init_image, h=h, w=w).to(self.device)
    if self.use_fp16:
        image = image.half()
    image = self.image_encoder.encode(image) * self.scale
    print(image.shape, image.dtype)
    # Take the last timestep of the schedule and try a single reverse DDIM step.
    t = torch.tensor([diffusion.timestep_map[999]]).to(self.device)
    print(t, t.shape)
    image_noise = diffusion.ddim_reverse_sample(model=self.model, x=image, t=t)
...
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
torch.Size([1, 4, 96, 96]) torch.float16
tensor([999], device='cuda:0') torch.Size([1])
Traceback (most recent call last):
File "/home/kandinsky2/code/bdAi/test.py", line 282, in <module>
model.test()
File "/home/kandinsky2/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/kandinsky2/code/bdAi/test.py", line 260, in test
image_noise = diffusion.ddim_reverse_sample(model=self.model, x=image, t=t)
File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/gaussian_diffusion.py", line 535, in ddim_reverse_sample
out = self.p_mean_variance(
File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/respace.py", line 102, in p_mean_variance
return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/gaussian_diffusion.py", line 251, in p_mean_variance
model_output = model(x, s_t, **model_kwargs)
File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/respace.py", line 133, in __call__
return self.model(x, new_ts, **kwargs)
File "/home/kandinsky2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/text2im_model2_1.py", line 88, in forward
text_outputs = self.get_text_emb(
File "/home/kandinsky2/lib/python3.9/site-packages/kandinsky2/model/text2im_model2_1.py", line 61, in get_text_emb
clip_seq = self.clip_to_seq(image_emb).reshape(
File "/home/kandinsky2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/kandinsky2/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
TypeError: linear(): argument 'input' (position 1) must be Tensor, not NoneType
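Looking at the traceback, the failure seems to be that get_text_emb receives image_emb=None: I did not pass any model_kwargs to ddim_reverse_sample, so no conditioning ever reaches the UNet. My guess is that the call needs the same conditioning dict that generate_img builds. A rough sketch of what I mean (the helper names generate_clip_emb / encode_text, the attribute names text_encoder / tokenizer1, and the kwarg keys full_emb / pooled_emb / image_emb are my guesses from reading the repo code, not something I have verified):

# Sketch: build the conditioning the decoder UNet expects and pass it
# through model_kwargs. Helper names and kwarg keys are assumptions
# based on how generate_img appears to build its conditioning.
prompt = ""  # hypothetical: invert conditioned on an empty prompt
image_emb = self.generate_clip_emb(
    prompt,
    batch_size=batch_size,
    prior_cf_scale=prior_cf_scale,
    prior_steps=prior_steps,
    negative_prior_prompt=negative_prior_prompt,
)
full_emb, pooled_emb = self.encode_text(
    text_encoder=self.text_encoder,
    tokenizer=self.tokenizer1,
    prompt=prompt,
    batch_size=batch_size,
)
model_kwargs = {
    "full_emb": full_emb,
    "pooled_emb": pooled_emb,
    "image_emb": image_emb,
}
image_noise = diffusion.ddim_reverse_sample(
    model=self.model, x=image, t=t, model_kwargs=model_kwargs
)

Is that the right idea, or is there an intended way to supply the conditioning?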
More generally, I want to use this in a similar way to the generate_img2img method: for example, to obtain the inverted latent for the image embedding at the output of the encoder.
I am not sure I understand the API correctly. Can the model argument of diffusion.ddim_reverse_sample simply be self.model?
And for the argument t, if we want the same number of steps, should it be 100?
Afterwards, I want to pass the result to the generate_img method as its noise argument.
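To make the intent concrete: as far as I understand from the guided-diffusion code this repo is based on, ddim_reverse_sample performs a single deterministic step from x_t to x_{t+1} and returns a dict with a "sample" key, so a full inversion would loop over all timesteps. Roughly this is what I have in mind (a sketch; it assumes the diffusion object was created with the respaced schedule I intend to sample with, e.g. 100 steps, and that model_kwargs is built as in the sketch above):

# Rough sketch of the full DDIM inversion loop I have in mind.
# Assumes diffusion uses the same respaced schedule as sampling and
# that model_kwargs carries the conditioning, as sketched earlier.
latent = image  # the encoded, scaled latent from the image encoder
for i in range(diffusion.num_timesteps):
    t = torch.tensor([i] * latent.shape[0], device=self.device)
    out = diffusion.ddim_reverse_sample(
        model=self.model, x=latent, t=t, model_kwargs=model_kwargs
    )
    latent = out["sample"]
# latent should now approximate x_T; the idea is to feed it back into
# generate_img as the initial noise (assuming generate_img accepts one).

Is this loop the intended way to use ddim_reverse_sample, or does the single call at t=999 that I tried make sense?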