

* Diffusers-OCaml: A Diffusers API in OCaml

This work is a port of the [[https://github.com/LaurentMazare/diffusers-rs/][diffusers-rs]] library written in Rust.

** Stable-Diffusion
*** How to run?
- Download the weights and save them in the data directory. For instructions on how to do this, refer to the [[https://github.com/LaurentMazare/diffusers-rs#converting-the-original-weight-files][diffusers-rs docs]].
- Once you have the ".ot" files, you are ready to run Stable Diffusion.
- I have found that placing the UNet and CLIP on the GPU and the VAE on the CPU works best for my GPU (see the sketch after these commands). Here is the command that does that:
#+begin_src bash
dune exec stable_diffusion -- generate "lighthouse at dark" "vae" "data/pytorch_model.ot" "data/vae.ot" "data/unet.ot"
#+end_src
- To run all models on the CPU:
#+begin_src bash
dune exec stable_diffusion -- generate "lighthouse at dark" "all" "data/pytorch_model.ot" "data/vae.ot" "data/unet.ot"
#+end_src
- To generate more than one sample, use the num_samples parameter:
#+begin_src bash
dune exec stable_diffusion -- generate "lighthouse at dark" "all" "data/pytorch_model.ot" "data/vae.ot" "data/unet.ot" --num_samples=2
#+end_src
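For illustration, the device split above amounts to keeping the denoising latents on the GPU and moving them to the CPU only for VAE decoding. Below is a minimal sketch of that idea, assuming the ocaml-torch Torch API; the tensors are placeholders and the real pipeline modules are not shown.

#+begin_src ocaml
(* Sketch only: placeholder tensors stand in for the real latents and models. *)
open Torch

let () =
  let gpu = Device.cuda_if_available () in
  let cpu = Device.Cpu in
  (* The latents stay on the GPU while the UNet denoises them. *)
  let latents = Tensor.randn [ 1; 4; 64; 64 ] ~device:gpu in
  (* ... the UNet denoising loop would run here, on the GPU ... *)
  (* Move the final latents to the CPU before the VAE decodes them. *)
  let latents_cpu = Tensor.to_device latents ~device:cpu in
  List.iter (Printf.printf "%d ") (Tensor.shape latents_cpu)
#+end_src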

*** Sample generated images
#+CAPTION: lighthouse at dark
#+NAME: fig:lighthouse.png
[[./media/lighthouse.png]]
#+CAPTION: rusty robot holding a torch
#+NAME: fig:rusty_robot.png
[[./media/sd_final.2.png]]
*** Performance on an 8GB Nvidia GeForce GTX 1070 Mobile GPU
It takes about 27 seconds to generate an image. Measurements were done on an Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz (12 logical CPUs). Running all the steps on the CPU takes a little more than three minutes. With the VAE on the CPU and the UNet and CLIP on the GPU:
#+begin_src bash
real 0m25.110s
user 0m39.271s
sys 0m11.518s
#+end_src
** Image to Image generation
*** How to run?
#+begin_src bash
dune exec img2img -- img2img media/in_img2img.jpg
#+end_src
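For reference, img2img does not start from pure noise: the input image is encoded by the VAE and the resulting latent is noised to an intermediate timestep, after which denoising resumes. A minimal ocaml-torch sketch of that noising step follows; the shapes and the alpha-bar value are placeholders (the real value comes from the scheduler).

#+begin_src ocaml
(* Sketch only: how img2img noises the encoded input image before denoising. *)
open Torch

let () =
  let device = Device.cuda_if_available () in
  (* Stand-in for the VAE latent of the input image (batch, channels, h/8, w/8). *)
  let init_latent = Tensor.randn [ 1; 4; 64; 64 ] ~device in
  let noise = Tensor.randn [ 1; 4; 64; 64 ] ~device in
  (* Cumulative alpha at the chosen start timestep; placeholder value. *)
  let alpha_bar = 0.6 in
  let sqrt_alpha_bar = sqrt alpha_bar in
  let sqrt_one_minus = sqrt (1. -. alpha_bar) in
  let noisy_latent =
    Tensor.((init_latent * f sqrt_alpha_bar) + (noise * f sqrt_one_minus))
  in
  (* The denoising loop then runs only the remaining timesteps on this latent. *)
  List.iter (Printf.printf "%d ") (Tensor.shape noisy_latent)
#+end_src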

*** Sample generated image
- Input image
#+CAPTION: img2img
#+NAME: fig:in_img2img.png
[[./media/in_img2img.jpg]]
- Output image
#+CAPTION: img2img
#+NAME: fig:out_img2img.png
[[./media/out_img2img.png]]
*** Performance on an 8GB Nvidia GeForce GTX 1070 Mobile GPU
With the VAE on the CPU and the UNet and CLIP on the GPU:
#+begin_src bash
real 0m15.628s
user 0m34.571s
sys 0m5.833s
#+end_src
** Inpaint
*** How to run?
#+begin_src bash
dune exec inpaint -- generate media/sd_input.png media/sd_mask.png --cpu="vae" --prompt="Face of a panda, high resolution, sitting on a park bench"
#+end_src
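As background, the Stable Diffusion inpainting checkpoints feed the UNet a 9-channel input built from the noisy latent, the downsampled mask, and the masked-image latent. The ocaml-torch sketch below illustrates that general technique with placeholder tensors; it is not a copy of this repository's code.

#+begin_src ocaml
(* Sketch only: assembling a 9-channel inpainting UNet input from placeholders. *)
open Torch

let () =
  let device = Device.cuda_if_available () in
  (* Noisy latent being denoised (4 channels at latent resolution). *)
  let noisy_latent = Tensor.randn [ 1; 4; 64; 64 ] ~device in
  (* Mask downsampled to latent resolution: 1.0 where the image is repainted. *)
  let mask = Tensor.ones [ 1; 1; 64; 64 ] ~device in
  (* VAE latent of the input image with the masked region blanked out. *)
  let masked_image_latent = Tensor.randn [ 1; 4; 64; 64 ] ~device in
  (* 4 + 1 + 4 = 9 input channels for the inpainting UNet. *)
  let unet_input =
    Tensor.cat [ noisy_latent; mask; masked_image_latent ] ~dim:1
  in
  List.iter (Printf.printf "%d ") (Tensor.shape unet_input)
#+end_src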
*** Sample generated image
- Prompt: Face of a panda, high resolution, sitting on a park bench
#+CAPTION: inpaint-input
#+NAME: fig:sd_input.png
[[./media/sd_input.png]]
#+CAPTION: inpaint-mask
#+NAME: fig:sd_mask.png
[[./media/sd_mask.png]]
#+CAPTION: inpaint-output
#+NAME: fig:panda.png
[[./media/panda.png]]
*** Performance
With the VAE on the CPU and the UNet and CLIP on the GPU:
#+begin_src bash
real 0m28.055s
user 0m51.600s
sys 0m13.851s
#+end_src
** Note
Only FP32 weights are supported.