[Feat]: Custom Max Token Length for SD/SDXL

Open ja1496 opened this issue 1 year ago • 3 comments

Describe your use-case.

Increase the maximum number of tokens supported by SD1.5 and SDXL beyond the current limit of 75.

What would you like to see as a solution?

Increased token length support, even if it's a little hacky.

Have you considered alternatives? List them here.

No response

ja1496 avatar Aug 11 '24 23:08 ja1496

This would be great. There was a whitepaper on DALL-E 3 where they used descriptive synthetic captions to improve the model. The default limit of 75 tokens is quite restrictive for captions like that.

keclee avatar Aug 15 '24 13:08 keclee

I have created code in this fork (https://github.com/celll1/OneTrainer/tree/dev) that supports prompts of up to 3 chunks of 75 tokens each (225 tokens in total) for the Text Encoder. It has been confirmed to work with SDXL LoRA.

Please note that the handling of BOS/EOS tokens differs from the implementation in sd-scripts. I am also not sure whether an attention mask should be applied.
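
To make the chunking idea concrete, here is a minimal sketch. It is not taken from the fork or from sd-scripts. It assumes each chunk gets its own BOS and EOS token (one of several possible BOS/EOS conventions), that the last chunk is padded to 77 tokens, and that the pad token equals EOS as in the SD1.5 tokenizer. The function name `chunk_token_ids` and its parameters are hypothetical. The attention mask question is left open here.

```python
from typing import List

BOS_ID = 49406  # CLIP <|startoftext|>
EOS_ID = 49407  # CLIP <|endoftext|>


def chunk_token_ids(
    token_ids: List[int],
    chunk_tokens: int = 75,
    max_chunks: int = 3,
    pad_id: int = EOS_ID,  # assumption: pad with EOS; other tokenizers may differ
) -> List[List[int]]:
    """Split raw prompt token ids (no BOS/EOS) into fixed-size chunks.

    Each returned chunk has chunk_tokens + 2 entries:
    BOS, up to chunk_tokens prompt tokens, EOS, then padding.
    """
    # Drop anything beyond the supported maximum (3 x 75 = 225 by default).
    token_ids = token_ids[: chunk_tokens * max_chunks]

    chunks = []
    # max(..., 1) guarantees one chunk even for an empty prompt.
    for start in range(0, max(len(token_ids), 1), chunk_tokens):
        body = token_ids[start : start + chunk_tokens]
        padding = [pad_id] * (chunk_tokens - len(body))
        chunks.append([BOS_ID] + body + [EOS_ID] + padding)
    return chunks


if __name__ == "__main__":
    fake_prompt = list(range(1000, 1100))  # 100 made-up token ids
    for i, chunk in enumerate(chunk_token_ids(fake_prompt)):
        print(i, len(chunk), chunk[:3], chunk[-3:])
```

A common approach from here is to run each 77-token chunk through the text encoder separately and concatenate the resulting hidden states along the sequence dimension, but whether that matches the fork is an assumption.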

celll1 avatar Sep 03 '24 09:09 celll1

Hop on the Discord and ask Nerogar. Once you've discussed it with him and he's reviewed it (and it sits behind a flag/checkbox), I am confident he would accept a PR.

O-J1 avatar Sep 03 '24 09:09 O-J1