external library rebuilds figures when figure order changes
When using \usetikzlibrary{external}, the externalized pictures get written to files named like <main file name>-figure<number>. As I just discussed with @hmenke, it would be useful if files could optionally be named by the MD5 sum (which is already calculated to detect changes). That would eliminate the need to rebuild unchanged figures, e.g. when reverting to a previous version of a figure, when changing figure order, or when inserting a new figure between existing ones. I am aware that this can cause the number of files to get quite large as old ones are never deleted/overwritten, so the default behavior should not be changed.
This is actually harder than I had anticipated, because the code of the externalization library is mostly unreadable gibberish.
What about scanning the current (or a designated) folder for an .md5 file with the same MD5 sum before building? If it finds a match, it can use the figure with the same name as the .md5 file; if not, it switches to the build process. Renaming the figure by MD5 sum could work, but since there are already .md5 files, why not make use of them? And the figure name won't be too long to quote inside the tex file.
@mukron: It rebuilds because the automatic file names change, e.g. from main-figure0 to main-figure1 and vice versa if you switch the first two. Using \tikzsetnextfilename{⟨file name⟩} should avoid the rebuild. I did not test this.
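A minimal sketch of that suggestion, assuming a document compiled with shell escape enabled. The file names `circle-diagram` and `square-diagram` are hypothetical; any name that stays attached to its picture should do. Because each picture carries its own name, swapping the two pictures should no longer change which output file belongs to which picture.

```latex
\documentclass{article}
\usepackage{tikz}
\usetikzlibrary{external}
\tikzexternalize % needs e.g. pdflatex -shell-escape

\begin{document}

% Explicit name for the next picture only (hypothetical name).
\tikzsetnextfilename{circle-diagram}
\begin{tikzpicture}
  \draw (0,0) circle (1cm);
\end{tikzpicture}

% A second picture with its own fixed name (hypothetical name).
\tikzsetnextfilename{square-diagram}
\begin{tikzpicture}
  \draw (0,0) rectangle (2,2);
\end{tikzpicture}

\end{document}
```

The cost is that every picture has to be named by hand, and names must stay unique across the document.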
Is there any progress on this issue? It would be great to have a fix; to me it makes externalize more or less unusable (it takes 14 min 37 s to compile my documentation with externalize vs 42 s without)... Of course, the rebuild is very quick (3 seconds), but this is counterbalanced by the fact that I need to recompile everything if I'm writing stuff at the beginning of the file.
@tobiasBora did you find a workaround for this issue? thank you
@bf the simplest (but imperfect) workaround is to add \tikzsetfigurename{nameprefix} from time to time (changing nameprefix to a different value each time), so that when you add a picture earlier in the document, only the pictures sharing that nameprefix are recompiled. I'm not sure whether it would be possible to automatically add \tikzsetfigurename{<hash of the image>} before each new image (if so, it certainly requires some sort of rescan dark magic, as I tried here to rename the environment and use a macro for figures instead of an environment)… But in my humble opinion, external has quite a lot of bugs (bad design?) that make it annoying to use (see e.g. https://github.com/pgf-tikz/pgf/issues/1137 or the list of issues I gave here in section 4.5). For instance, the bigger your document is, the longer it takes to compile, up to the point where it's not worth using external for very large documents. A full rewrite (maybe independent of tikz) would certainly be much better… and potentially easier.
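A rough sketch of the prefix workaround, assuming one prefix per section. The prefixes `intro-` and `results-` are hypothetical. As far as I understand the external library, automatic names are formed as the figure name followed by a counter kept separately per name, so a picture inserted in the first section should only shift the numbering of the `intro-` files.

```latex
\documentclass{article}
\usepackage{tikz}
\usetikzlibrary{external}
\tikzexternalize % needs e.g. pdflatex -shell-escape

\begin{document}

\section{Introduction}
% Pictures from here on are numbered under this prefix (hypothetical name).
\tikzsetfigurename{intro-}
\begin{tikzpicture}
  \draw (0,0) -- (1,1);
\end{tikzpicture}

\section{Results}
% New prefix: numbering here does not depend on how many
% pictures the Introduction contains.
\tikzsetfigurename{results-}
\begin{tikzpicture}
  \draw (0,0) rectangle (1,1);
\end{tikzpicture}

\end{document}
```

Pictures that share a prefix are still renumbered when one is added before them, which is why this only limits the rebuild instead of removing it.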
Sadly, I don't have much time now to write my own library ^^
@bf because I was tired of this, I ended up creating my own library (in the meantime, I realized today that there is also this project, which seems to try to solve a similar issue with a quite different approach).
You can find the documentation of my library here: https://raw.githubusercontent.com/leo-colisson/robust-externalize/master/doc/robust-externalize.pdf
Thanks @tobiasBora I've settled on an approach using the MD5. But the memoization looks much more elegant.
What do you mean by using the MD5? The memoize library mentioned above seems interesting and elegant, but I've never used it, as it seems to be quite specific to LaTeX code. My library robust-externalize is quite reliable in my experience, and I have used it to cache videos, Python code, etc. in a really flexible way. For example, I can pass data such as the page number to a Python script that extracts frames from a video and adds the page number below each frame before including it back into the PDF, adding the current number of frames to the page number, etc. All of this is cached and updated whenever the page or the video changes. I do plan to simplify the interface significantly, though: everything will be considered a placeholder, and people will be able to create new placeholders and append stuff to them. So if you use the current version, make sure to copy the .sty into your project (it's not yet on CTAN anyway), as I will make significant changes in September, and keep in mind that the interface will be simplified a lot soon.