AutomaTikZ
Text-Guided Synthesis of Scientific Vector Graphics with TikZ
AutomaTikZ is a software library designed for the automatic creation of scientific vector graphics using natural language descriptions. Generating vector graphics such as SVGs directly can be challenging, but AutomaTikZ simplifies the process by using TikZ, a well-known abstract graphics language that can be compiled into vector graphics, as an intermediary format. TikZ's human-oriented, high-level commands facilitate conditional language modeling with any large language model. AutomaTikZ comes with a variety of tools for working with such models.
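For readers unfamiliar with TikZ, the following minimal sketch shows what such high-level drawing commands look like when wrapped in a compilable LaTeX document. It is a hand-written illustration, not model output. The helper name `standalone_tikz` and the choice of the `standalone` document class are assumptions made for this example and are not part of AutomaTikZ.

```python
def standalone_tikz(body):
    """Wrap TikZ drawing commands in a minimal LaTeX document (hypothetical helper)."""
    return "\n".join([
        r"\documentclass[tikz]{standalone}",  # assumed preamble, chosen for brevity
        r"\begin{document}",
        r"\begin{tikzpicture}",
        body.strip(),
        r"\end{tikzpicture}",
        r"\end{document}",
    ])

# Two labeled nodes connected by an arrow.
body = r"""
\node[draw, circle] (a) at (0, 0) {input};
\node[draw, circle] (b) at (3, 0) {output};
\draw[->] (a) -- (b);
"""

print(standalone_tikz(body))
```

The short, named commands (`\node`, `\draw`) are what make TikZ a convenient target for a language model compared with low-level SVG path data.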
[!NOTE] If you make use of this work, please cite it.
Installation
The base version of AutomaTikZ, which already supports inference and training, can be installed as a regular Python package using pip:
pip install 'automatikz[pdf] @ git+https://github.com/potamides/AutomaTikZ'
For compilation of generated code, AutomaTikZ additionally requires a full TeX Live installation, ghostscript, and, for rasterization of vector graphics, poppler.
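Before attempting compilation, it can help to confirm that these external tools are reachable. The following is a minimal sketch using only the Python standard library. The executable names (`pdflatex` for TeX Live, `gs` for ghostscript, `pdftoppm` for poppler) are assumptions about a typical Linux or macOS installation and are not taken from AutomaTikZ itself; the helper `find_missing_tools` is likewise hypothetical.

```python
import shutil

# Assumed executable names for a typical installation; adjust for your system.
REQUIRED_TOOLS = {
    "TeX Live": "pdflatex",
    "ghostscript": "gs",
    "poppler": "pdftoppm",
}

def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools whose executable is not found on PATH."""
    return [name for name, exe in tools.items() if shutil.which(exe) is None]

if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing external dependencies:", ", ".join(missing))
    else:
        print("All external dependencies were found on PATH.")
```

Note that finding `pdflatex` on PATH does not guarantee a full TeX Live installation; individual packages may still be missing.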
If your goal is to run your own instance of the included web UI, clone the repository and install it in editable mode instead:
git clone https://github.com/potamides/AutomaTikZ
pip install -e 'AutomaTikZ[webui]'
Usage
As long as the required dependencies are installed, using AutomaTikZ to generate, compile, render, and save TikZ drawings is straightforward.
from automatikz.infer import TikzGenerator, load
generate = TikzGenerator(*load("nllg/tikz-clima-13b"), stream=True)
caption = (
"Visual representation of a multi-layer perceptron: "
"an interconnected network of nodes, showcasing the structure of input, "
"hidden, and output layers that facilitate complex pattern recognition."
)
tikzdoc = generate(caption) # streams generated tokens to stdout
tikzdoc.save("mlp.tex") # save the generated code
if tikzdoc.has_content: # true if generated tikzcode compiles to non-empty pdf
    tikzdoc.rasterize().show() # rasterize pdf to a PIL.Image and show it
    tikzdoc.save("mlp.pdf") # save the generated pdf
More involved examples, both for inference and training, can be found in the examples folder.
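Because generated code does not always compile, one possible pattern is to retry generation until `has_content` is true. The sketch below assumes that repeated calls to the generator can yield different outputs (for example, because decoding is sampled), which the text above does not state. The helper `generate_until_compiles` and its `max_attempts` parameter are hypothetical and not part of AutomaTikZ; it relies only on the `has_content` attribute shown in the usage example.

```python
def generate_until_compiles(generate, caption, max_attempts=3):
    """Call `generate(caption)` until the result reports `has_content`.

    `generate` is any callable returning an object with a boolean
    `has_content` attribute. Returns the first result that compiles,
    or the last attempt if none of them did.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    tikzdoc = None
    for _ in range(max_attempts):
        tikzdoc = generate(caption)
        if tikzdoc.has_content:
            break  # stop at the first compilable document
    return tikzdoc
```

With the `generate` object and `caption` from the usage example, this would be called as `tikzdoc = generate_until_compiles(generate, caption)`; the caller should still check `tikzdoc.has_content` before saving a PDF.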
Model Weights
We release the following weights of fine-tuned LLaMA and CLiMA language models on the Hugging Face Model Hub:
- CLiMA7b: nllg/tikz-clima-7b
- CLiMA13b: nllg/tikz-clima-13b
- LLaMA7b: nllg/tikz-llama-7b
- LLaMA13b: nllg/tikz-llama-13b
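When switching between these checkpoints in scripts, a small lookup table avoids typing Hub identifiers by hand. This is only an illustrative sketch: the identifiers are copied from the list above, while the `MODEL_IDS` table, the `model_id` helper, and the (family, size) keys are hypothetical conveniences, not part of AutomaTikZ.

```python
# Hub identifiers copied from the list above; the keys are illustrative only.
MODEL_IDS = {
    ("clima", "7b"): "nllg/tikz-clima-7b",
    ("clima", "13b"): "nllg/tikz-clima-13b",
    ("llama", "7b"): "nllg/tikz-llama-7b",
    ("llama", "13b"): "nllg/tikz-llama-13b",
}

def model_id(family, size):
    """Return the Hub identifier for a model family and size (case-insensitive)."""
    key = (family.lower(), size.lower())
    if key not in MODEL_IDS:
        known = ", ".join(f"{f}-{s}" for f, s in sorted(MODEL_IDS))
        raise ValueError(f"unknown model {family}-{size}; expected one of: {known}")
    return MODEL_IDS[key]
```

The returned string can be passed to `load` in the same way as the literal `"nllg/tikz-clima-13b"` in the usage example, for instance `load(model_id("clima", "13b"))`.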
Datasets
While we provide the official version of our DaTikZ dataset on the Hugging Face Hub, we had to remove a considerable portion of TikZ drawings originating from arXiv, as the arXiv non-exclusive license does not permit redistribution. We do, however, release our dataset creation scripts and encourage anyone to recreate the full version of DaTikZ themselves.
Acknowledgments
The implementation of our CLiMA model is largely based on LLaVA.