Dataset Processing Nodes and Improved LoRA Trainer Nodes with multi-resolution support
This PR proposes a set of Dataset Processing Nodes that let users load and save image-text datasets, precompute the latents and conditioning, and save them to disk. This makes datasets easier to manage and shortens startup time in training workflows.
The Trainer Node now supports multi-resolution training: it runs several forward passes on bs=1 inputs and only then does the backward pass, to "simulate" the behavior of bs>1. (Note: this is different from gradient accumulation.)
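The multi-resolution step can be sketched as follows. This is a minimal illustration assuming PyTorch, not the trainer code from this PR: the `multires_step` helper, the `Conv2d` stand-in model, the MSE loss, and the latent shapes are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def multires_step(model, optimizer, latents, targets):
    """One optimizer step over latents of differing spatial sizes.

    Each latent is forwarded alone (batch size 1) because tensors with
    different H x W cannot be stacked into a single batch.
    """
    optimizer.zero_grad()
    losses = []
    for latent, target in zip(latents, targets):
        pred = model(latent.unsqueeze(0))            # forward with bs=1
        losses.append(F.mse_loss(pred, target.unsqueeze(0)))
    loss = torch.stack(losses).mean()                # average over the "batch"
    loss.backward()                                  # one backward after all forwards
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
model = nn.Conv2d(4, 4, kernel_size=3, padding=1)    # hypothetical stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
latents = [torch.randn(4, 64, 64), torch.randn(4, 96, 48)]  # two resolutions
targets = [torch.randn_like(x) for x in latents]
loss_value = multires_step(model, optimizer, latents, targets)
```

In this sketch the per-sample losses are collected first and `backward()` is called once on their mean. A typical gradient-accumulation loop would instead call `backward()` after each forward pass.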
In the future we can consider adding optional bucketing, but the current implementation should be enough.
Here are some screenshots of the example workflows (each image's PNG info contains the workflow):
- Resize all images to the same pixel count (while keeping their different resolutions/aspect ratios), calculate the latents and text embeddings, then save them.
- Use the training set saved by the first workflow to train a LoKr model.
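The "same pixel count, different aspect ratio" resize in the first workflow can be illustrated with a small size calculation. This is only a sketch: the function name, the default target of 1024x1024 pixels, and the rounding to a multiple of 8 are assumptions, not values taken from this PR.

```python
import math


def size_for_pixel_count(width, height, target_pixels=1024 * 1024, multiple=8):
    """Return a (width, height) with roughly `target_pixels` pixels.

    The aspect ratio is preserved up to rounding; both sides are rounded
    to a multiple of `multiple` (an assumed constraint for latent encoding).
    """
    scale = math.sqrt(target_pixels / (width * height))
    new_w = max(multiple, round(width * scale / multiple) * multiple)
    new_h = max(multiple, round(height * scale / multiple) * multiple)
    return new_w, new_h


# Images with different aspect ratios end up with a similar pixel budget.
sizes = [size_for_pixel_count(w, h) for w, h in [(1024, 1024), (1920, 1080), (2000, 3000)]]
```

Each resulting size keeps roughly the original aspect ratio, so images of different shapes cost a similar amount of compute during training.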
This PR resolves:
- proper progress bar usage to resolve #8668
- dataset nodes system to resolve #9824 and #9742
cc @comfyanonymous
As suggested by @comfyanonymous, xxxProcessingNode will optionally support is_input_list=True, which makes both implementation and usage easier.
I will also make an example showing how to create a custom xxxProcessing node in a custom node implementation.
This PR is therefore a draft until the tasks above are done.
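To illustrate what an optional is_input_list=True mode could mean for a processing node, here is a plain-Python sketch. It assumes the flag switches between per-item processing and processing the whole list at once. The class names and the `run`/`process` methods are hypothetical and are not the classes added by this PR.

```python
class ProcessingNodeSketch:
    """Hypothetical base class, for illustration only."""

    # False: process() receives one item at a time.
    # True: process() receives the whole list in a single call.
    is_input_list = False

    def process(self, data):
        raise NotImplementedError

    def run(self, items):
        if self.is_input_list:
            return list(self.process(list(items)))
        return [self.process(item) for item in items]


class StripCaption(ProcessingNodeSketch):
    """Per-item node: only needs to see one caption."""

    def process(self, text):
        return text.strip()


class DropDuplicateCaptions(ProcessingNodeSketch):
    """List-level node: needs the whole dataset to find duplicates."""

    is_input_list = True

    def process(self, texts):
        seen, kept = set(), []
        for text in texts:
            if text not in seen:
                seen.add(text)
                kept.append(text)
        return kept


stripped = StripCaption().run([" a cat ", "a dog"])
deduped = DropDuplicateCaptions().run(["a cat", "a dog", "a cat"])
```

Under this assumption, simple nodes only implement the per-item case, while nodes that need the whole dataset (deduplication, shuffling, filtering) opt in to the list mode.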
Update: I made an example repository showing how to write custom image/text processing nodes for the dataset processing system:
https://github.com/KohakuBlueleaf/Comfy-DSNode-Example
I think this PR is ready to merge now.