Question about data preprocessing
I see that in the data preprocessing you normalize the control image to [0, 1] but the target image to [-1, 1]. In fact, the cv2 -> float conversion will introduce a 0.5 error delta when you subtract 127.5. Why don't you use the same normalization method for the source and target images? Another question: your scripts don't support multi-size images, so all images must be resized to a fixed size by some method. Could you please give some advice on training with a large number of images of different sizes (e.g., aspect ratios from 1:4 to 4:1)?
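To make the two conventions concrete, here is a minimal sketch of the arithmetic, assuming the control image is computed as x / 255 and the target as x / 127.5 - 1 (equivalently (x - 127.5) / 127.5). The helper names are hypothetical and not taken from the repository. Under that assumption the two mappings are exact affine transforms of each other, so 0 and 255 land exactly on the range endpoints and no 0.5 offset appears. Only the value range differs, presumably because the target feeds a component that expects inputs in [-1, 1].

```python
# Illustrative only: these helper names are not from the repository.

def to_unit_range(pixel: int) -> float:
    """Map an 8-bit value to [0, 1], as described for the control image."""
    return pixel / 255.0


def to_signed_range(pixel: int) -> float:
    """Map an 8-bit value to [-1, 1], as described for the target image."""
    return pixel / 127.5 - 1.0


for p in (0, 127, 128, 255):
    unit = to_unit_range(p)
    signed = to_signed_range(p)
    # The two conventions are related by signed = 2 * unit - 1.
    assert abs(signed - (2.0 * unit - 1.0)) < 1e-12
    print(p, unit, signed)
```

Note that 127.5 is the midpoint between the integer levels 127 and 128, so no 8-bit value maps to exactly 0 in the signed range; that is a property of the scale, not a rounding error.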
I have the same question. Is there some reason behind this choice? The Diffusers ControlNet training script also does it the same way:
image_transforms = transforms.Compose(
    [
        transforms.Resize(args.resolution, interpolation=transforms.InterpolationMode.BILINEAR),
        transforms.CenterCrop(args.resolution),
        transforms.ToTensor(),
        transforms.Normalize([0.5], [0.5]),
    ]
)
conditioning_image_transforms = transforms.Compose(
    [
        transforms.Resize(args.resolution, interpolation=transforms.InterpolationMode.BILINEAR),
        transforms.CenterCrop(args.resolution),
        transforms.ToTensor(),
    ]
)
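Both pipelines above resize and center-crop every image to a square of side args.resolution, which discards much of a 1:4 or 4:1 image. One common workaround, which is not part of the quoted script, is aspect-ratio bucketing: assign each image to the closest of a few fixed resolutions and build each batch from a single bucket. The sketch below is only an illustration under that assumption; the bucket list and the function names are hypothetical.

```python
import math

# Hypothetical bucket list (width, height). Each bucket has roughly
# 512 * 512 pixels and sides that are multiples of 64.
BUCKETS = [(256, 1024), (384, 704), (512, 512), (704, 384), (1024, 256)]


def nearest_bucket(width, height, buckets=BUCKETS):
    """Return the bucket whose aspect ratio is closest in log space."""
    target = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - target))


def group_by_bucket(sizes):
    """Map each bucket to the indices of the images assigned to it."""
    groups = {}
    for index, (w, h) in enumerate(sizes):
        groups.setdefault(nearest_bucket(w, h), []).append(index)
    return groups


sizes = [(1920, 480), (640, 640), (500, 2000), (800, 600)]
print(group_by_bucket(sizes))
```

Each image would then be resized and cropped to its bucket's resolution rather than to a single square, and a batch would be drawn from one bucket at a time so that the tensors can be stacked. The source and target images of a pair need the same bucket and the same crop.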
I have the same question
The model does not train at all. More precisely, it does not respond to the input images when I test it... It just takes longer to train: 5000-10000 steps.