pygraphviz
pygraphviz copied to clipboard
Wheels for PyGraphviz - advice?
I spent some time working out how to build wheels, at least for Linux and Mac:
https://github.com/matthew-brett/pygraphviz-wheels
There are two big problems:
The tests still need the graphviz binaries installed
This is because the tests, at least, need the graphviz binaries on the path. I can imagine something that would be a lot more work, that shipped binaries in the wheel, probably renamed, to avoid clashing with possible system binaries - e.g. pygv-dot
instead of dot
.
Does pygraphviz do anything useful without the binaries? Does it make any sense to ship the wheels without them?
The tests are failing with graphviz installed
The tests fail differently on Mac and Linux:
- Mac (using the new graphviz installation built for the pygraphviz wheel);
- Linux (after `apt-get install graphviz).
I guess this is because of subtle differences in the output of the commands, across versions.
Thanks for working on this.
Pygraphviz is a wrapper around the cgraph
library from GraphViz. I don't think it needs the entire GraphViz program bundled with it. But I'm not sure it is easy to pull apart the parts that it needs (the libcgraph
library and the cgraph.h
header file). It would be great to have it available as a wheel
distribution. The question is really how much work that will be.
I haven't looked at the code/tests in a long time. It is quite possible that some rely on dot
or other parts of GraphViz. I'll start looking into that and your build code now.
Thanks!
Thanks very much for taking a look. Obviously it's no problem to rely on an installed graphviz for the tests, but I was hoping it wasn't too hard to stick to internal calls to the C libraries for the non-test code, so a wheel would be standalone and not rely on any graphviz applications.
(Linux build) The tests you point to are failing in many places, but the failures are all coming from an extra space added: graph {
instead of graph {
. I don't know where to look for a cause of this. I compile it on my machine and it doesn't do that.
(Mac Build) The tests you point to are mostly coming from graph "" {
instead of graph {
.
Any ideas on what package/library would control printing of spaces or empty strings? These don't cause errors on my mac installed from anaconda. Is there a good way to set up my machine to be able to recreate what travis is doing on my local machine?
I was mistaken earlier when I said the only libraries/files needed were ``libcgraphand
cgraph.h. We also need
libcdt```.
The build isn't using Anaconda - it's using Python.org Python, and packages installed from pip. Is it possible that the build is compiling with something more recent (perhaps in the dependencies) than Anaconda is using?
OK... first thing I notice: GraphViz v2.40 has a known bug for multigraphs that has been fixed in the latest unreleased version of Graphviz. That fix will be available in v2.42, but not v2.40. So all the tests I have running locally are with GraphViz v2.38.
Can we switch to installing GraphViz 2.38? I don't think that will affect these test failures, but its a place to start.
Ouch - that's harder than it seems, because graphviz appears to provide only the latest release on its site, and building from a git checkout was bothersome, maybe because of old build tools on manylinux. But I'll give it a shot.
For NetworkX we gave up on building GraphViz from source in favor of using miniconda and its prebuilt graphviz binaries. Is that an easier route?
wget https://repo.continuum.io/miniconda/Miniconda3-4.3.21-MacOSX-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p $HOME/miniconda
export PATH="$HOME/miniconda/bin:$PATH"
hash -r
conda config --set always_yes yes --set changeps1 no
conda update -q conda
# Useful for debugging any issues with conda
conda info -a
conda install graphviz=2.38
dot -V
Making Graphviz from source worked on my Mac with git from gitlab.com
. Can't find any version 2.38 tag from gitlab.com though....
git clone https://gitlab.com/graphviz/graphviz.git graphviz
cd graphviz
./autogen.sh
./configure
./make # or cd lib/cdt ; make ; cd ../cgraph ; make # if you only want those two libraries
Yes, it works fine on Mac. The problem is with Manylinux, where the autoconf / automake / libtool versions are old.
Is there a good known commit, on graphviz master, that we could use instead?
It looks like linux from scratch has a copy of 2.38: http://ftp.oregonstate.edu/.2/lfs-website/blfs/view/svn/general/graphviz.html
And this may be helpful: https://gitlab.com/graphviz/graphviz/issues/1350
And this: https://gitlab.com/graphviz/graphviz/issues/1371
It seems like commit f54ac2c9 may correspond to the 2.38 release.
OK - I think that's right, that commit f54ac2c9 is 2.38. I got graphviz building from the repo, with a bit of fiddling, and the tests pass now, on macOS and Linux, as long as graphviz is installed:
https://travis-ci.org/matthew-brett/pygraphviz-wheels/builds/416335635
Now the easy stuff is done :) - the question is whether it is possible to use or even test pygraphviz without the graphviz binaries installed.
Nice work!
Because pygraphviz is essentially a wrapper around the libraries I don't think it can be used or tested easily without some sort of binaries present. That said, if libcgraph and libcdt are present we should be able to test it by creating a graph, adding an edge/node and then reporting the edges and nodes of that graph. I'm not sure I understand which question we need to address... my understanding is that we would need to include a version of those libraries in the wheel if we want it to work without a separate binary installation. If we want the wheel to work with a separate binary installation then we have to find that separate installation and that has caused most of the installation questions in github.
I am pretty sure I don't understand this very well - but - the libraries will be present, inside the wheel - of course they have to be, in order for the extension symbols to be available. But I guess that isn't what you meant?
I don't know pygraphviz at all - but to make the question more specific - is there any prospect of making some or all the examples work, just by calls into the extensions, and therefore to the embedded libraries?
https://pygraphviz.github.io/examples.html
Specifically, is it possible to make some of the examples work without any of the graphviz application binaries (like dot
) on the system PATH?
Sorry - to be specific - the auditwheel
and delocate
utilities find the dynamic libraries that a Python extension needs, and embed them into the wheel. So, no need to find the libraries - they will be there, but I think we currently do need to find the application binaries - like dot
.
The code itself works without the GraphViz application binaries like dot
. I guess that you are asking about the examples. The two examples in the link you show above work without dot
.
Most of the other examples only use dot
or neato
or circo
in the last 2-3 lines. They first create a graph, then create a dot file from that graph, then construct the layout and draw it using one of the GraphViz tools.
We could remove the drawing portions of those examples to make them testable. It does take away from the usefulness of the examples though. Perhaps we could leave that code as comments and then include one or two examples where those features are used more heavily (and not test those).
Am I finally getting your questions?
You are getting my questions - thanks for bearing with me.
The reason I asked about using the application binaries, was the failures here:
https://travis-ci.org/matthew-brett/pygraphviz-wheels/jobs/414015655
They are of form:
======================================================================
ERROR: pygraphviz.tests.test_attribute_defaults.test_default_attributes
----------------------------------------------------------------------
Traceback (most recent call last):
File "/venv/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
self.test(*self.arg)
File "/venv/local/lib/python2.7/site-packages/pygraphviz/tests/test_attribute_defaults.py", line 16, in test_default_attributes
assert_equal(A.string().expandtabs(2),
File "/venv/local/lib/python2.7/site-packages/pygraphviz/agraph.py", line 1278, in string
return self.string_nop()
File "/venv/local/lib/python2.7/site-packages/pygraphviz/agraph.py", line 1258, in string_nop
return self.draw(format='dot', prog='nop').decode(self.encoding)
File "/venv/local/lib/python2.7/site-packages/pygraphviz/agraph.py", line 1515, in draw
data = self._run_prog(prog, args)
File "/venv/local/lib/python2.7/site-packages/pygraphviz/agraph.py", line 1336, in _run_prog
runprog = r'"%s"' % self._get_prog(prog)
File "/venv/local/lib/python2.7/site-packages/pygraphviz/agraph.py", line 1323, in _get_prog
raise ValueError("Program %s not found in path." % prog)
ValueError: Program nop not found in path.
Tracking through that module, the _run_prog
method is used to call
-
unflatten
,tred
,acyclic
in theunflatten
,tred
,acyclic
methods, respectively; - any of
neato
,dot
,twopi
,circo
,fdp
,nop
in thelayout
anddraw
methods.
Is there any way of replacing these with library calls, do you think?
Ahhh.... I see what you are saying.
Yes, there are some functions which use _run_prog
with pipes to run these GraphViz programs using the dot files produced by pygraphviz. These are convenience functions in the sense that someone could "easily" do the same thing at the command line, but this sets up the pipes and does it all within the Python environment.
The answer (as I understand it) unfortunately is that it is essentially impossible to replace those calls with library calls. GraphViz doesn't provide that kind of interaction. There is no API for direct library calls with GraphViz.
So... it sounds like we have tools that will extract which libraries are used and include them with the wheel, but we don't have tools to automatically extract which command line tools we need available so they could be included in the wheel.
This helps me realize that while these are convenience functions, their existence means we probably should include the command line tools in the wheel too.... and that leads me to question whether we should be doing this as we'll be essentially including all of GraphViz in the wheel... rats
I'm sorry to be slow - but just to take an example. Let's say I want to unflatten my graph. I think you are saying there is no way to take the in-memory graph contained in an AGraph
instance, and unflatten it, without calling into the unflatten
binary? I guess the unflatten
binary does call into the library? Or is that not true ? Maybe the application binaries have a lot of code that is not in the library?
Yes, my understanding is that the applications binaries are primarily not using the libraries.
A quick scan:
-
nop
doesn't seem to have anything big in it; -
unflatten
,acyclic
, do have some code, but it looks like this could fairly easily be moved out into separate files; -
tred
has a little more code.
I couldn't immediately see where neato
, dot
etc were coming from.
I guess for now we could build the wheel, then say:
In order for this package to be fully functional, you should:
- Linux:
apt-get / yum install graphviz
- macOS
brew install graphviz
- Windows: install via https://graphviz.gitlab.io/_pages/Download/Download_windows.html (and we make some effort to find where the binaries went).
In the longer term, maybe we could ask the graphviz
developers to refactor the application stuff out into libraries.
Here is a pdf (https://www.graphviz.org/pdf/libguide.pdf) that describes something similar to what you want to do: use GraphViz as a library. I did not know about this capability. I'm not sure how easy it would be to rewrite the relevant parts of pygraphviz and I certainly don't have time to do that right now. Perhaps we can do as you suggest for a wheel with instructions for how to install graphviz. And then start the process of rewriting it to use the libraries.
But maybe I should dive into this a little more to see how much work is actually involved.
From the PDF, it looks as though at least the layout commands can be done from the library (section 3) and maybe the drawing (2.3). That's just at a quick read though. Maybe worth contacting the maintainers with the question?
OK... I've looked into this more and I think it is possible to:
- Use SWIG to make the graphviz calls
gvRender
andgvLayout
accessible from python. - Rewrite the pygraphviz code to use those libraries instead of using the command line tools.
This would be a fairly major upgrade/change in the code base. The tests would change too. The benefit would be ability to easily include just the libraries in a wheel distribution instead of all GraphViz.
@jarrodmillman and @matthew-brett, this will not be able to be done in a timely manner. What do you think about building a wheel that doesn't include GraphViz and telling people they need to have the GraphViz tools on the os path?
For now, we will release 1.5 without rewriting to use the library. Perhaps, we could rewrite before releasing 1.6. There is still some work needed to get the Windows wheels. Once that is done, I will release 1.5 (hopefully, before Monday 8/27).
@matthew-brett, @dschult I am starting to work on making a new pygraphviz release. It would really like to see if we can get a binary wheel for this release.
@dshult - did you make any progress in transitioning to the API? Is it a lot of work?
@matthew-brett I have basically not touched this since summer 2018. I will try to jog my memory enough to get started again. If you are up on the latest SWIG it would be great to get gvRender
and gvLayout
into the interface files to make them accessible from python. That's the first steps.
I have proof of concept code that adds the library libgvc to lib graph and uses gvRender and gvLayout to create a png of a graph laid out with dot. I think the process will be easy to generalize to the other layout tools and other image formats. Right now it is swig 3.x but I’ll try with 4.0 today.
I guess I will start by adding this functionality to the existing agraph _run_prog() methods. But I think in the long run this might replace them. What do we need to remove to be able to build the wheels? Is it enough to add the library so this new approach works? Or do we need to remove some of the old functionality?