
How to save internal representation (latent-space vector) of the image for later modification?

Open ProkopHapala opened this issue 3 years ago • 2 comments

I would like to store the latent-space vector representation of an image so that I can later use it as a starting point and modify it with different text prompts.

More details: I'm looking at the script https://github.com/CompVis/stable-diffusion/blob/main/scripts/img2img.py and, as I understand it, it works like this:

z_enc = sampler.stochastic_encode(init_latent, torch.tensor([t_enc]*batch_size).to(device)) transforms the source image into a latent-space vector (i.e. a vector of concepts/ideas).

The diffusion/purification is then done on this latent-space vector, applying the text prompt as a condition:

c = model.get_learned_conditioning(prompts)
samples = sampler.decode(z_enc, c, t_enc, unconditional_guidance_scale=opt.scale, unconditional_conditioning=uc)

Here I would like to store the latent-space vector for later use (that is, after it has been biased/converged toward the text prompt, but before it is transformed back to image space).

x_samples = model.decode_first_stage(samples) transforms back from latent space to image space to produce the final image.

Sorry if this is not the proper place to ask such a question (maybe this is just for error reports). In that case, please let me know where it is possible to discuss this and get help. Thank you.

ProkopHapala avatar Sep 04 '22 10:09 ProkopHapala

You can store samples from samples = sampler.decode(z_enc, c, t_enc, unconditional_guidance_scale=opt.scale,unconditional_conditioning=uc,) and later use it in place of init_latent here: z_enc = sampler.stochastic_encode(init_latent, torch.tensor([t_enc]*batch_size).to(device))
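
A minimal sketch of how storing and reloading could look, assuming samples is an ordinary PyTorch tensor. The helper names save_latent and load_latent and the file name latent.pt are hypothetical, and the random tensor only stands in for the real output of sampler.decode (its shape assumes a 4-channel latent at 1/8 of a 512x512 image).

```python
import os
import tempfile

import torch


def save_latent(samples: torch.Tensor, path: str) -> None:
    """Write the latent to disk (hypothetical helper, not part of img2img.py)."""
    # Detach from the autograd graph and move to CPU so the file can be
    # loaded later on a machine without the same GPU.
    torch.save(samples.detach().cpu(), path)


def load_latent(path: str, device: str = "cpu") -> torch.Tensor:
    """Read a previously saved latent back onto the requested device."""
    return torch.load(path, map_location=device)


# Stand-in for: samples = sampler.decode(z_enc, c, t_enc, ...)
samples = torch.randn(1, 4, 64, 64)

path = os.path.join(tempfile.mkdtemp(), "latent.pt")
save_latent(samples, path)

# In a later run, the loaded tensor would take the place of init_latent
# in the sampler.stochastic_encode(...) call, with a new prompt.
init_latent = load_latent(path)
```

Saving the tensor before model.decode_first_stage keeps it in latent space, so a later run can skip re-encoding an image and go straight to the stochastic_encode step.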

KostyaAtarik avatar Sep 08 '22 10:09 KostyaAtarik

@KostyaAtarik I wonder if you have ever used this vector representation to calculate image similarity between two generated images?

hosseinsarshar avatar Feb 11 '23 01:02 hosseinsarshar