Sana
Pre-emptive Feature anticipation
Probably gonna shortlist some wonky ideas, but hey, if this tool is going to be workable anywhere, it had better be feature-full
- [ ] Finetuning and LoRA (or other PEFT type) training toolkit https://github.com/Nerogar/OneTrainer
- [ ] LoRA (or other PEFT type) and model merging (or extraction) https://github.com/hako-mikan/sd-webui-supermerger
- [ ] Compatibility with X-Adapter, in case people have LoRAs for other base models https://github.com/showlab/X-Adapter
- [ ] CFG manipulation cus it's a thing https://github.com/Extraltodeus/ComfyUI-AutomaticCFG/discussions/53
- [ ] InvokeAI and other inpainting tools (e.g. Krita) that allow for creative exploration https://github.com/invoke-ai/InvokeAI
- [ ] Hacks like LCM, Lightning, and Turbo (yes, imagine the 100x getting another zero on top, cus why not?)
- [ ] ControlNet and other related tooling for allowing more flexibility
- [ ] OpenPose and facial landmarks for character generation
- [ ] Depth maps and normal maps for 3D-aided generation
- [ ] Line art or segmentation for illustrations and cartoons
- [ ] LLM tooling for getting Sana to talk with other models that can possibly improve UX or help with finetuning
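For anyone unfamiliar with the CFG item in the list above, here is a minimal sketch of the standard classifier-free guidance blend that CFG-manipulation tools build on. It is not taken from Sana or AutomaticCFG; the function name `apply_cfg` and the toy values are made up for illustration, and plain lists stand in for the noise-prediction tensors a real pipeline would use.

```python
def apply_cfg(uncond, cond, scale):
    """Standard classifier-free guidance: push the unconditional
    prediction toward the conditional one by a factor of `scale`."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy stand-ins for the model's two noise predictions (hypothetical values).
uncond = [0.0, 1.0, 2.0]
cond = [1.0, 1.0, 0.0]

guided = apply_cfg(uncond, cond, scale=4.5)
print(guided)
```

With `scale=1.0` the result is just the conditional prediction, and with `scale=0.0` it is the unconditional one. As far as I understand it, CFG manipulation mostly comes down to changing `scale` per step or post-processing the blended result.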
Nice one. We will work on supporting features, including the ones you listed above, throughout this year. We will also keep improving the quality of DC-AE and Sana, and we plan to release something like v1.5 in the future. We are trying to make high-compression tech popular.
I'd like to add some feature requests too, as I have been using PixArt-Sigma and have had some success with it.
- Achieve a Midjourney-style aesthetic quality similar to what PixArt-Sigma currently offers.
- Include the image datasets that were used to train the PixArt-Sigma model.
Definitely, I am willing to integrate these features.
@doogyhatts please do not have aesthetic locking, since coming from the SDXL derivative side of things, people have really put in effort to make everything aesthetically flexible.
Simply have different base models, including one that does not have any style-specific datasets applied to it.
@doogyhatts here is the thing tho... if there is a way to create a style embedding for all major implicit styles (looking at the current and future methodology of Pony), then the problem of style bias becomes trivial, and finetuning (or pruning) would be easier. LoRA extractors kind of do this already.
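For context on the LoRA point: a LoRA is a low-rank weight delta, so merging folds it into the base weights, and extraction goes the other way by approximating the difference between two checkpoints with low-rank factors. Below is a minimal sketch of the merge direction only. The names `merge_lora`, `lora_a`, `lora_b`, and `scale` are illustrative assumptions, not the API of any tool mentioned here, and real tools operate on torch tensors rather than nested lists.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(w_base, lora_b, lora_a, scale=1.0):
    """Fold a low-rank update into a base weight: W + scale * (B @ A).

    `scale` is a stand-in for the usual alpha / rank factor times any
    user-chosen strength (an assumption for this sketch).
    """
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w_base, delta)]


# Toy 2x2 base weight with a rank-1 LoRA (hypothetical values).
w_base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]    # shape (2, 1)
lora_a = [[0.5, -1.0]]     # shape (1, 2)

merged = merge_lora(w_base, lora_b, lora_a, scale=2.0)
print(merged)
```

An extractor would start from `merged` and `w_base`, take their difference, and recover factors like `lora_b` and `lora_a` (typically via a truncated SVD).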
My boss wants me to use this model, but there is no LoRA training support, and I have no idea how to finish my work.
Please help me.
@lawrence-cj Advent calendar or bust, bro XD 🕙 but seriously, a timeline of this with conservative estimates would be sweet
I'm not resting. You tell me the priority or do something to help, pls? @TomLucidor
- https://github.com/city96/ComfyUI_ExtraModels/pull/84
- https://github.com/huggingface/diffusers/pull/9982
- https://github.com/huggingface/diffusers/pull/9708
- BF16 model fine-tuning https://github.com/bghira/SimpleTuner/pull/1187
LoRA is supported in diffusers. Refer to the official diffusers docs:
https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_sana.md