Compel fails with TypeError: 'NoneType' object cannot be interpreted as an integer when using T5Tokenizer
This issue likely affects any tokenizer that does not define a bos_token, of which T5Tokenizer is one example:
```
Traceback (most recent call last):
  File "/workspaces/diffuser-tests/compel-test.py", line 9, in <module>
    prompt_embeds = compel("An astronaut riding a green+ horse")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/compel/compel.py", line 135, in __call__
    output = self.build_conditioning_tensor(text_input)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/compel/compel.py", line 112, in build_conditioning_tensor
    conditioning, _ = self.build_conditioning_tensor_for_conjunction(conjunction)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/compel/compel.py", line 186, in build_conditioning_tensor_for_conjunction
    this_conditioning, this_options = self.build_conditioning_tensor_for_prompt_object(p)
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/compel/compel.py", line 218, in build_conditioning_tensor_for_prompt_object
    return self._get_conditioning_for_flattened_prompt(prompt), {}
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/compel/compel.py", line 282, in _get_conditioning_for_flattened_prompt
    return self.conditioning_provider.get_embeddings_for_weighted_prompt_fragments(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/compel/embeddings_provider.py", line 119, in get_embeddings_for_weighted_prompt_fragments
    tokens, per_token_weights, mask = self.get_token_ids_and_expand_weights(fragments, weights, device=device)
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/compel/embeddings_provider.py", line 280, in get_token_ids_and_expand_weights
    return self._chunk_and_pad_token_ids(all_token_ids, all_token_weights, device=device)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/compel/embeddings_provider.py", line 318, in _chunk_and_pad_token_ids
    all_token_ids_tensor = torch.tensor(all_token_ids, dtype=torch.long, device=device)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object cannot be interpreted as an integer
```
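The root cause is visible on the tokenizer itself: T5's tokenizer defines no BOS token, so its `bos_token_id` is `None`, and that `None` ends up in the token id list handed to `torch.tensor`. A quick check, using `t5-small` purely for illustration:

```python
from transformers import AutoTokenizer

# T5's tokenizer defines pad and EOS tokens but no BOS token.
tok = AutoTokenizer.from_pretrained("t5-small")
print(tok.bos_token_id)  # None  <- this None ends up in all_token_ids
print(tok.pad_token_id)  # 0
print(tok.eos_token_id)  # 1
```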
Steps to reproduce, using PixArt-alpha/PixArt-XL-2-1024-MS, which uses T5Tokenizer:
```python
from diffusers import AutoPipelineForText2Image
from compel import Compel

pipe = AutoPipelineForText2Image.from_pretrained("PixArt-alpha/PixArt-XL-2-1024-MS", use_safetensors=True)
pipe.to("cpu")

compel = Compel(tokenizer=pipe.tokenizer, text_encoder=pipe.text_encoder, device="cpu")
prompt_embeds = compel("An astronaut riding a green+ horse")
neg_prompt_embeds = compel("painting++")

result = pipe(
    prompt=None,
    prompt_embeds=prompt_embeds,
    negative_prompt=None,
    negative_prompt_embeds=neg_prompt_embeds,
    num_images_per_prompt=1,
    num_inference_steps=15,
    height=1024,
    width=1024,
    output_type="pil",
)
result.images[0].save('image.png', "PNG")
```
Debugging the code suggests the problem lies here and here; a simple hacky fix is to replace:
```python
[self.tokenizer.bos_token_id]
```

with

```python
([self.tokenizer.pad_token_id] if self.tokenizer.bos_token_id is None else [self.tokenizer.bos_token_id])
```
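Pulled out as a helper, the workaround amounts to the sketch below (the name is mine, not Compel's actual internals):

```python
def leading_token_ids(tokenizer):
    # T5Tokenizer and similar tokenizers define no BOS token, so fall
    # back to the pad token rather than letting None reach torch.tensor().
    if tokenizer.bos_token_id is None:
        return [tokenizer.pad_token_id]
    return [tokenizer.bos_token_id]
```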
However, I suspect this logic will only work for the T5Tokenizer, which has the following note in the docs:

> Note that T5 uses the pad_token_id as the decoder_start_token_id, so when doing generation without using [generate()](https://huggingface.co/docs/transformers/v4.37.2/en/main_classes/text_generation#transformers.GenerationMixin.generate), make sure you start it with the pad_token_id.
Unfortunately, it still does not work with PixArt-Alpha even then, as that pipeline expects an attention mask when embeds are passed in directly, and supporting that would require wider changes to Compel.
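For anyone blocked on this, calling the tokenizer and text encoder directly does produce the mask the pipeline wants. A minimal sketch, reusing `pipe` from the reproduction above; the `max_length=120` value and the `prompt_attention_mask` argument are assumptions about the PixArt-Alpha pipeline and should be checked against your diffusers version:

```python
# Build T5 embeddings plus the attention mask that Compel does not return.
inputs = pipe.tokenizer(
    "An astronaut riding a green horse",
    padding="max_length",
    max_length=120,  # assumed PixArt-Alpha token limit
    truncation=True,
    return_tensors="pt",
)
prompt_embeds = pipe.text_encoder(
    inputs.input_ids, attention_mask=inputs.attention_mask
)[0]
# pipe(prompt=None, prompt_embeds=prompt_embeds,
#      prompt_attention_mask=inputs.attention_mask, ...)
```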
Rather than large-scale changes, it might be better in the short term to add a type check in the Compel constructor asserting that the tokenizers are CLIPTokenizers, as per the method signature; Python doesn't actually enforce type hints at runtime, hence the confusing downstream errors. An update to the README.md would also be useful, as it currently seems to suggest Compel will work with any tokenizer, which is evidently not the case.
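Such a guard could be as small as the sketch below (the helper name is made up; this is not Compel's actual code):

```python
from transformers import CLIPTokenizer, CLIPTokenizerFast

def assert_clip_tokenizer(tokenizer):
    # Fail fast with an explicit message instead of the opaque
    # TypeError raised much later inside torch.tensor().
    if not isinstance(tokenizer, (CLIPTokenizer, CLIPTokenizerFast)):
        raise TypeError(
            f"Compel only supports CLIP tokenizers, got {type(tokenizer).__name__}"
        )
```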
Thanks for the comments; yes, type checking would probably be a good idea.
Hi @damian0815, was this fixed? Do we still need Compel for the T5 encoder? I hit the same issue with flux.1-dev.
Looks like Compel is still needed; what should I do now?
Compel doesn't support T5, only CLIP.