ControlNet
ControlNet: converting a sketched circuit into a vector-engineered circuit, some questions
I want to achieve what is shown in the diagram below: converting a hand-drawn circuit diagram into a vector-engineered diagram.
Every training record uses the same control text, e.g.: {"text": "Convert this sketch circuit diagram into a standard vectorized circuit diagram created with software", "image": "images/1000_1_360015968.jpg", "conditioning_image": "conditioning_images/1000_1_360015968.jpg"}
I have prepared 50,000 such data pairs, each pairing a sketched circuit diagram with an identical vector image. I then trained ControlNet with:
batchsize=1 (my GPU is a single 3090),
lr = 2e-6,
precision=32,
accumulate_grad_batches=4,
sd_locked = False,
only_mid_control = False,
After roughly 2 epochs, the outputs on the test sample images had barely changed. The input sketch did not seem to have any controlling effect, and the model appeared to be overfitting: regardless of the input sketch, the outputs were random combinations of circuit diagrams.
Here are some examples:
text: Convert this sketch circuit diagram into a standard vectorized circuit diagram created with software
control image:
output image:
Can someone give me some advice? Thanks!
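What I want to do is take hand-drawn physics circuit diagrams and use ControlNet to output circuit diagrams that look as if they were drawn with engineering software, as in the examples above. So far I have prepared 50K hand-drawn sketches with corresponding vector diagrams, which I had people draw one-to-one. The control text is fixed to the same single sentence for every pair: "Convert the hand-drawn circuit diagram into an engineering vector diagram".
After 2 epochs the results had basically stabilized: whatever control image I feed in, the output is a random combination of circuit diagrams. The style is acceptable, but the input exerts no control at all, and the output looks nothing like the input sketch!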
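One inexpensive thing to rule out with a dataset like the one described in this post is a malformed or mispaired metadata record. The sketch below assumes one JSON object per line with the three keys from the record quoted earlier (`text`, `image`, `conditioning_image`) and assumes, based only on that one example, that a sketch and its vector image share a file name. The function name `check_records` is hypothetical and not part of ControlNet.

```python
import json
from collections import Counter
from pathlib import PurePosixPath

# Keys taken from the example record in the post; the one-object-per-line
# layout is an assumption.
REQUIRED_KEYS = {"text", "image", "conditioning_image"}


def check_records(lines):
    """Return (prompt counts, list of (line number, problem)) for metadata lines."""
    prompts = Counter()
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            problems.append((number, "missing keys: " + ", ".join(sorted(missing))))
            continue
        prompts[record["text"]] += 1
        # Assumption: paired files share a file name, as in the quoted example.
        target_name = PurePosixPath(record["image"]).name
        control_name = PurePosixPath(record["conditioning_image"]).name
        if target_name != control_name:
            problems.append((number, "image and conditioning_image file names differ"))
    return prompts, problems


if __name__ == "__main__":
    sample = [
        '{"text": "Convert this sketch", "image": "images/a.jpg", '
        '"conditioning_image": "conditioning_images/a.jpg"}',
    ]
    prompts, problems = check_records(sample)
    print(len(prompts), "distinct prompt(s);", len(problems), "problem(s)")
```

A clean report does not explain the missing control effect by itself; it only removes mispaired data as a possible cause. The prompt count also confirms that the text carries no per-image information, so all guidance has to come from the control image.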
Hello!
First of all, lock your SD, which seems to be unlocked (sd_locked = False). Second, use a larger effective batch size, for example 64, via gradient accumulation. A larger batch size will cost you more training time but should give better results.
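To make the batch-size suggestion concrete, here is a minimal sketch of the arithmetic, using the numbers from the original post (50,000 pairs, batchsize=1, accumulate_grad_batches=4). The helper names are hypothetical, and the step count assumes that leftover batches at the end of an epoch still trigger an optimizer step.

```python
import math


def effective_batch_size(per_device_batch, accumulate_grad_batches, num_devices=1):
    """Number of samples that contribute to each optimizer step."""
    return per_device_batch * accumulate_grad_batches * num_devices


def optimizer_steps_per_epoch(num_samples, per_device_batch,
                              accumulate_grad_batches, num_devices=1):
    """Optimizer steps in one pass over the data (assumes a final partial step)."""
    batches = math.ceil(num_samples / (per_device_batch * num_devices))
    return math.ceil(batches / accumulate_grad_batches)


if __name__ == "__main__":
    num_samples = 50_000
    # 4 is the setting from the post; 64 is the value suggested in this reply.
    for accumulation in (4, 64):
        print(
            f"accumulate_grad_batches={accumulation}: "
            f"effective batch {effective_batch_size(1, accumulation)}, "
            f"{optimizer_steps_per_epoch(num_samples, 1, accumulation)} steps per epoch"
        )
```

Gradient accumulation keeps the per-step memory footprint of batchsize=1, so it should still fit on a single 3090. The trade-off is far fewer optimizer updates per epoch, which is where the extra training time comes from.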
Friend, have you solved your problem? I am facing the same problem and still can't find the answer, so I wanted to ask whether you can give me some hints or help. Thank you. Like you, I want the input image to control the output image. I tried the method mentioned in the comments, but it didn't work. The output images are random and chaotic and have nothing to do with the input image. I'm going crazy.
I have similar problems. Have you solved it?
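Maybe it is hard for a diffusion model to recognize circuit components, just as it is with text or characters, when you try to reconstruct images that contain them.
My personal guess: circuit components are a lot like text symbols. I feel that this family of diffusion models can't read text symbols and can't faithfully reconstruct the text in an image, so likewise it can't properly understand how circuit components are arranged.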