DeepSVG for text-conditioned vector generation.

Open nd7141 opened this issue 2 years ago • 1 comments

I wonder if it's possible to adapt DeepSVG to replace the VAE block in Stable Diffusion (SD) so that it generates vector graphics?

I see a couple of problems.

1. The latent embedding size in DeepSVG (256) does not match the latent embedding size of SD (64).
2. The diffusers library expects a .bin file instead of a .pth file. There is a conversion script for diffusers, but it seems to use AutoencoderKL, and I'm not sure that is the right architecture here.
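
To make the latent mismatch concrete, here is a minimal sketch that compares the two latents by shape and element count. Both shapes are assumptions rather than values read from either codebase: 256 is the DeepSVG figure quoted above, and (4, 64, 64) is what the SD v1 VAE produces for a 512x512 image, which is presumably where the 64 comes from.

```python
from math import prod

# Assumed shapes, not read from either codebase.
DEEPSVG_LATENT_SHAPE = (256,)    # one flat vector per SVG (figure quoted above)
SD_LATENT_SHAPE = (4, 64, 64)    # SD v1 VAE latent for a 512x512 image: 4 channels, 8x downsampling


def latent_size(shape):
    """Number of scalar values in a latent of the given shape."""
    return prod(shape)


def describe(name, shape):
    """One-line summary of a latent's shape, rank, and element count."""
    return f"{name}: shape={shape}, rank={len(shape)}, values={latent_size(shape)}"


print(describe("DeepSVG", DEEPSVG_LATENT_SHAPE))
print(describe("SD v1", SD_LATENT_SHAPE))
```

If these shapes are right, the difference is not only the number of values but also the rank: the SD denoiser works on a spatial grid, so a flat DeepSVG vector would probably need either a learned projection to a grid or a different denoiser.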

I wonder if you know an easy way to adapt DeepSVG for the diffusers library?

Mar 24 '23 10:03 nd7141

That's a great idea, although I wonder whether training a text-based LM on a dataset of SVG source code would be a better way to go about this. I don't know. Edit: I found a project called VectorFusion, which generates SVGs from text descriptions using a diffusion model. The authors have a paper on arXiv, but unfortunately they have not published their code. The main author has an older GitHub repository that does something similar, but I haven't tried it yet.
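
As a rough illustration of the text-based LM idea, the sketch below splits SVG path data into command and number tokens, which is one possible input representation for such a model. The function name `tokenize_path` and the token scheme are hypothetical and do not come from DeepSVG or VectorFusion.

```python
import re

# Hypothetical tokenizer: a token is either a path command letter or a number.
_TOKEN_RE = re.compile(r"[MmLlHhVvCcSsQqTtAaZz]|-?\d*\.?\d+(?:[eE][-+]?\d+)?")


def tokenize_path(d):
    """Split the `d` attribute of an SVG <path> into command and number tokens."""
    return _TOKEN_RE.findall(d)


print(tokenize_path("M10 20 L30.5-40 Z"))
# ['M', '10', '20', 'L', '30.5', '-40', 'Z']
```

A real pipeline would still need to map these tokens to vocabulary IDs and decide how to quantize coordinates, both of which are left open here.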

Mar 29 '23 12:03 PranavSudersan