Embed images in SVG output
[Update 2020-11-16]
This PR has evolved from simply fixing how relative image paths are resolved, to embedding any images in the SVG file by default, eliminating the original issue.
Previously, paths to images were resolved relatively to the specified output file instead of the input file.
This produces errors when calling WireViz with the -o flag, creating output files in a different location from the input.
This PR [partially, see comment below] fixes the problem.
Additionally, it eliminates the need for passing a gv_dir variable to the Image class, resolving paths beforehand and passing the absolute path to the Image class.
It seems GraphViz only links to the image in the SVG output. This might be problematic if the output is stored in a different directory from the input, or moved afterwards. Further investigation is required.
@kvid I'd be interesting in your input on this topic. The solution would be complete if it wasn't for the fact that the SVG, and by extension, the HTML output, depend on having a working link to the original image file.
The .gv file is also affected, but this should be less of a problem for the average user, since this is just an intermediate file.
It seems that GraphViz does not offer a way to embed images?
Here are some new discoveries that might fix the above mentioned problem with embedded images.
- HTML
<IMG>tags allow embedding base64-encoded images within theSRCattribute, avoiding the need to link to external files: example 1; example 2 (encoded by me using base64-image.de) - GraphViz does not seem to support this. A quick test (using example 1 above) reveals that
produces an error:attribs['image']['src'] = 'data:image/png;base64, iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg=='graphviz.backend.CalledProcessError: Command '['dot', '-Tpng', '-O', 'ex08']' returned non-zero exit status 1. [stderr: b'Warning: No such file or directory while opening data:image/png;base64, iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==\nError: No or improper image file="data:image/png;base64, iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg=="\nin label of node Key\nWarning: No such file or directory while opening data:image/png;base64, iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==\nError: No or improper image file="data:image/png;base64, iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg=="\nin label of node W1\n'] - Inserting the data manually into an SVG file's
<image xlink:href="">attribute works! 🎉 Here isex08.svgwith bothxlink:hrefattributes containing the original images in base64 encoding. Note that the image renders correctly, even though the gist does not include any bitmap files. - This would fix the problem for SVG, and by extension, HTML output. GraphViz output can't be fixed, but I see this as a minor problem.
- More about data URIs
@kvid do you think it's a reasonable approach to convert and inject the images into the output SVG file in this way? If so, this PR should only fix the relative path issue, and I'll begin another one to work on the encoding.
I'm a bit busy these days, and are not able to contribute as often as I want.
Here are some new discoveries that might fix the above mentioned problem with embedded images.
- HTML
<IMG>tags allow embedding base64-encoded images within theSRCattribute, avoiding the need to link to external files: example 1; example 2 (encoded by me using base64-image.de)
That looks useful, but I don't think it's much in use. I've never seen it in an HTML page.
GraphViz does not seem to support this. A quick test (using example 1 above) reveals that
attribs['image']['src'] = 'data:image/png;base64, iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg=='produces an error:
graphviz.backend.CalledProcessError: Command '['dot', '-Tpng', '-O', 'ex08']' returned non-zero exit status 1. [stderr: b'Warning: No such file or directory while opening data:image/png;base64, iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==\nError: No or improper image file="data:image/png;base64, iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg=="\nin label of node Key\nWarning: No such file or directory while opening data:image/png;base64, iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==\nError: No or improper image file="data:image/png;base64, iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg=="\nin label of node W1\n']Inserting the data manually into an SVG file's
<image xlink:href="">attribute works! 🎉 Here isex08.svgwith bothxlink:hrefattributes containing the original images in base64 encoding. Note that the image renders correctly, even though the gist does not include any bitmap files.This would fix the problem for SVG, and by extension, HTML output. GraphViz output can't be fixed, but I see this as a minor problem.
@kvid do you think it's a reasonable approach to convert and inject the images into the output SVG file in this way? If so, this PR should only fix the relative path issue, and I'll begin another one to work on the encoding.
I'm a bit busy these days, and are not able to contribute as often as I want.
Don't worry about it, I know the situation.
You don't seem to have an e-mail address associated with your GitHub account (besides [email protected], which I assume won't work)... is there a way for me to send you a direct message? Please let me know via [email protected].. thanks!
The different approaches each have their advantages and disadvantages:
- When specifying an output location different from the input, all image files need to have absolute paths or optionally a modified relative path (if possible) or optionally a copy of the image files can be made at the output location. Converting to absolute paths will not work if the output location is not the final destination. I guess the user should have all 3 options.
- Some users might find it useful to allow a data URI as image source input, but it must be converted to a file for Graphviz to find it, and as in the previous point, absolute path or relative path must be selected.
- Some users might find it useful to generate a data URI in any image source of the SVG output, but I guess it should be optional per image independently of the options above (and maybe a harness option to be used when not specified for each image). The clean way is to make this a new output (e.g.
.data.svg) to avoid renaming or caching the Graphviz generated.svgbefore overwriting the same output name with the modified version containing data URIs.
Ideally, the output files should be self-contained, independent of any paths to other files. Anything else just causes more trouble, and the added effort of specifying behaviors (as per the first bullet point of your post) is not really worth it IMHO.
[Edit] The closest analogy I can think of is including schematic symbols and footprints in an EAGLE project. Once a library component has been included in a project, the project files can be shared without having to also share the libraries, since everything has been embedded within the schematic/PCB file itself. I find this immensely useful and would therefore like to replicate this behavior.
Therefore, my final goal is to:
- allow user to specify paths to images inside the YAML, relative (to the YAML file) or absolute
- embedding data URIs in the YAML was the idea behind #188 but I don't see it as a priority
- resolve relative image paths to absolute ones when parsing the YAML (this PR)
- generate BOM, PNG and SVG output files
- modify SVG output, embedding the images in the
<image xlink:href="">tags as described above using base64 encoding.- I don't really see the need for keeping the SVG with the reference to an external file, except for debugging purposes.
- generate HTML output, containing the modified SVG
As you can see, I'm trying to follow the KISS principle instead of providing too much configurability... I'd rather implement this straightforward (implementation-wise), but hopefully foolproof (from the user's perspective) method, and eventuall add more output options later.
Embedding images in the output SVG works :tada:
It has been tested with PNG, JPEG and animated GIFs. (To see the working animation, click the Raw button in the Gist, download the file, open in browser.)
@kvid, this PR is ready for review. Feel free to mark my comments as resolved if you agree, and let me know if you don't. I am open to discussing your ideas regarding finer user control over how to handle images (embed or not, copy to output directory, ...) as a separate PR, but I would first like to have embedded images as default since it leaves no open questions (see my comment above). Thanks!
Thanks for your suggestions. Don't worry about not responding earlier.. there is no obligation here. I've included most of your suggestions and commented on the ones that need discussion/clarification.
A major drawback with resolving all image paths to the absolute path, is that the output file contents will depend on the absolute path that normally differ between developers, and thereby lead to false diffs when doing regression testing.
The way I see it, this would only affect .gv files.
- Is it worth thinking about a solution in code to output a version of the
.gvfile with relative paths (for the diff), and one with absolutes? - Is it possible to somehow ignore certain patterns (i.e. only the
srcattribute of an<img>tag) while doing the diff?- This could be a task to include in #63 ...
Once #60 is finally implemented, .gv will likely not be used by most people outside of WireViz development.
A major drawback with resolving all image paths to the absolute path, is that the output file contents will depend on the absolute path that normally differ between developers, and thereby lead to false diffs when doing regression testing.
The way I see it, this would only affect
.gvfiles.
I agree.
- Is it worth thinking about a solution in code to output a version of the
.gvfile with relative paths (for the diff), and one with absolutes?
This is probably the easiest to implement.
Is it possible to somehow ignore certain patterns (i.e. only the
srcattribute of an<img>tag) while doing the diff?
- This could be a task to include in #63 ...
This would probably need to involve temporary renaming, create modified version, doing diff, and rename back for each .gv file, and something similar for a custom git add function to create the comparing base. It seems more complicated than your first suggestion.
Once #60 is finally implemented,
.gvwill likely not be used by most people outside of WireViz development.
I don't think it is used much by most people outside of WireViz development today either...
What would be the best way of implementing separate .gv files?
Some thoughts:
- Image paths are resolved in
wireviz.py -> parse(), since this is the context where the base path is known.- I propose reading the
image.srcattribute here, and placing the absolute path in a newimage.src_absattribute.
- I propose reading the
- The content for the
.gvfile is generated inHarness.py -> create_graph(), which useswv_gv_html.py -> html_image()- How about a flag for
create_graph()to choose whether to useimage.srcorimage.src_abswhen callinghtml_image()?
- How about a flag for
- The actual
.gvfile is output via theHarness.py -> output()function.- It feels kind of wasteful to call
create_graph()twice just to get the two file variants. - One option is to use the
fmtparameter to decide which file to generate.gv_relandgv_abscould be two of the allowed options and trigger the respective output, a file named*.rel.gvor*.abs.gv. This way, while developing, both files can be generated, while no double effort is wasted during normal use.
- It feels kind of wasteful to call
What do you think?
I'm sorry about this late answer. I agree with the suggested way of implementing separate .gv files. It seems like a reasonable approach.
A similar approach could also be used to enable an optional .html output with an ordinary image tag instead of embedding the .svg file. Then it will be possible to do regression testing on that file too.
Edit: Is it possible to use the same string for fmt as in the output file extension? (Skipping the leading dot in fmt is OK.)
A major drawback with resolving all image paths to the absolute path, is that the output file contents will depend on the absolute path that normally differ between developers, and thereby lead to false diffs when doing regression testing.
The way I see it, this would only affect
.gvfiles.I agree.
I am working on a refactoring of the WireViz CLI in #244 which will finally close #60 and will stop outputting .gv files by default.
Since you agree that .gv is rarely needed outside development, would you be OK if I implement your other suggestions, ignore the fact that absolute paths will be visible if .gv output is enabled, and we get this PR ready?
I am sure we can then create a follow-up PR to elegantly deal with the case where .gv is generated.
I am using smart_file_resolve() to determine relative image paths.
The current strategy does not consider whether the relative path is defined in the main YAML, or the prepended YAML. In both cases, it searches relative to the main YAML's directory first, and tries relative to the prepended YAML only if unsuccessful.
I realize it is not the absolute cleanest solution, but it is simple, deterministic, and should work in most cases... and might even have some advantages.
Rebased on top of latest to profit from smart_file_resolve().
Old branch with full commit history is backed up as backup/bugfix/image-src_before-latest-rebase