compel Support for SD3

Jun 13 '24 02:06 adhikjoshi

Wondering the same. Just getting SD3 integrated into DiffusionDeluxe and wondering if I should try the same compel code I had working for SDXL, only without the refiner. The difference with the SD3 pipeline is that it has 3 tokenizers and text_encoders instead of 2, and I'm not sure if it'll work when passing only 2 or if it's gonna need some reworking. I'm gonna try it out, but I get the impression it's not gonna work immediately with the new architecture, fingers crossed.

Jun 13 '24 06:06 Skquark

i unfortunately do not have the resources to update compel for SD3. i'd be happy to accept a pull request if someone wanted to figure out how to do it. keeping up with AI dev is exhausting and i'm not getting paid to do this.

Jun 15 '24 08:06 damian0815

I'm thinking of working on this! If anyone is interested in working on this together, shoot me an email @ [email protected] :)

Jun 20 '24 18:06 AbhinavGopal

@damian0815 How can we help you?

Jun 28 '24 13:06 MohamedAliRashad

figure out what needs to be done to support SD3 and do it :D

Jun 28 '24 17:06 damian0815

i might see if i can spare a few hours this weekend to take a look.

Jun 28 '24 17:06 damian0815

@damian0815 You don't need to. A library named sd_embed has achieved what we want.

Jun 29 '24 22:06 MohamedAliRashad

Yes, sd_embed did SD3 support for this. I would still like to see support within compel just because it is structured a bit better IMO.

Jun 30 '24 00:06 Daniel-SicSo-Edinburgh

Also, if you are using SD3 without T5, you can use the already existing functionality with some adjustments:

import torch
from compel import Compel, ReturnedEmbeddingsType
from diffusers import StableDiffusion3Pipeline

path_to_file = '.../sd3_medium_incl_clips.safetensors'  # path to sd3_medium_incl_clips.safetensors

# Load SD3 without the T5 encoder; only the two CLIP text encoders are used.
model = StableDiffusion3Pipeline.from_single_file(path_to_file,
                                                  torch_dtype=torch.float16,
                                                  use_safetensors=True,
                                                  text_encoder_3=None)
model.to("cuda")  # keep the text encoders on the same device that Compel is given below

prompt = 'Some prompt'
neg_prompt = 'Some negative prompt'

# Build Compel from the pipeline's two CLIP tokenizers and text encoders.
compeler = Compel(tokenizer=[model.tokenizer, model.tokenizer_2],
                  text_encoder=[model.text_encoder, model.text_encoder_2],
                  returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
                  truncate_long_prompts=False,
                  requires_pooled=[True, True],
                  device="cuda")

# Encode both prompts in one batch so they come back padded to the same length.
embeds, pooled_embeds = compeler([prompt, neg_prompt])

prompt_embed = embeds[0].unsqueeze(0)
neg_prompt_embed = embeds[1].unsqueeze(0)

# Zero-pad the last (feature) dimension by 2048 on the right: 2048 -> 4096.
prompt_embed = torch.nn.functional.pad(prompt_embed, (0, 2048))
neg_prompt_embed = torch.nn.functional.pad(neg_prompt_embed, (0, 2048))

pooled_prompt_embed = pooled_embeds[0].unsqueeze(0)
pooled_neg_prompt_embed = pooled_embeds[1].unsqueeze(0)

images = model(prompt_embeds=prompt_embed,
               pooled_prompt_embeds=pooled_prompt_embed,
               negative_prompt_embeds=neg_prompt_embed,
               negative_pooled_prompt_embeds=pooled_neg_prompt_embed,
               guidance_scale=3,
               num_inference_steps=40).images

images[0].save('output_image.png')
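
In case the 2048 in the pad calls looks like a magic number, here is a rough sketch of the arithmetic behind it. It assumes the usual SD3 medium widths (768 for the first CLIP encoder, 1280 for the second, 4096 for the joint text width that the T5 embeddings would normally set); the helper name sd3_clip_padding is made up for illustration and is not part of compel or diffusers.

```python
def sd3_clip_padding(clip_l_dim=768, clip_g_dim=1280, joint_dim=4096):
    """Return (concatenated CLIP width, right-padding needed to reach joint_dim).

    The default widths are assumptions about SD3 medium, not values read from a model.
    """
    clip_width = clip_l_dim + clip_g_dim
    if clip_width > joint_dim:
        raise ValueError("CLIP embeddings are wider than the joint text width")
    return clip_width, joint_dim - clip_width


clip_width, pad = sd3_clip_padding()
print(clip_width, pad)  # 2048 2048

# The second value is what goes into the pad call on the last dimension:
#   torch.nn.functional.pad(prompt_embed, (0, pad))
```

If your checkpoint uses different encoder widths, check the last dimension of the embeddings compel returns and pad by whatever is missing instead of hard-coding 2048.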

Jun 30 '24 01:06 Daniel-SicSo-Edinburgh

i have compel SD3 90% of the way there..

Jun 30 '24 21:06 damian0815

i have compel SD3 90% of the way there..

Any update?

Jul 28 '24 19:07 nom