pygraphviz icon indicating copy to clipboard operation
pygraphviz copied to clipboard

Wheels for PyGraphviz - advice?

Open matthew-brett opened this issue 5 years ago • 58 comments

I spent some time working out how to build wheels, at least for Linux and Mac:

https://github.com/matthew-brett/pygraphviz-wheels

There are two big problems:

The tests still need the graphviz binaries installed

This is because the tests, at least, need the graphviz binaries on the path. I can imagine something that would be a lot more work, that shipped binaries in the wheel, probably renamed, to avoid clashing with possible system binaries - e.g. pygv-dot instead of dot.

Does pygraphviz do anything useful without the binaries? Does it make any sense to ship the wheels without them?

The tests are failing with graphviz installed

The tests fail differently on Mac and Linux:

  • Mac (using the new graphviz installation built for the pygraphviz wheel);
  • Linux (after `apt-get install graphviz).

I guess this is because of subtle differences in the output of the commands, across versions.

matthew-brett avatar Aug 09 '18 15:08 matthew-brett

Thanks for working on this.

Pygraphviz is a wrapper around the cgraph library from GraphViz. I don't think it needs the entire GraphViz program bundled with it. But I'm not sure it is easy to pull apart the parts that it needs (the libcgraph library and the cgraph.h header file). It would be great to have it available as a wheel distribution. The question is really how much work that will be.

I haven't looked at the code/tests in a long time. It is quite possible that some rely on dot or other parts of GraphViz. I'll start looking into that and your build code now.

Thanks!

dschult avatar Aug 14 '18 16:08 dschult

Thanks very much for taking a look. Obviously it's no problem to rely on an installed graphviz for the tests, but I was hoping it wasn't too hard to stick to internal calls to the C libraries for the non-test code, so a wheel would be standalone and not rely on any graphviz applications.

matthew-brett avatar Aug 14 '18 16:08 matthew-brett

(Linux build) The tests you point to are failing in many places, but the failures are all coming from an extra space added: graph { instead of graph {. I don't know where to look for a cause of this. I compile it on my machine and it doesn't do that.

(Mac Build) The tests you point to are mostly coming from graph "" { instead of graph {.

Any ideas on what package/library would control printing of spaces or empty strings? These don't cause errors on my mac installed from anaconda. Is there a good way to set up my machine to be able to recreate what travis is doing on my local machine?

dschult avatar Aug 14 '18 17:08 dschult

I was mistaken earlier when I said the only libraries/files needed were ``libcgraphandcgraph.h. We also need libcdt```.

dschult avatar Aug 14 '18 17:08 dschult

The build isn't using Anaconda - it's using Python.org Python, and packages installed from pip. Is it possible that the build is compiling with something more recent (perhaps in the dependencies) than Anaconda is using?

matthew-brett avatar Aug 14 '18 17:08 matthew-brett

OK... first thing I notice: GraphViz v2.40 has a known bug for multigraphs that has been fixed in the latest unreleased version of Graphviz. That fix will be available in v2.42, but not v2.40. So all the tests I have running locally are with GraphViz v2.38.

Can we switch to installing GraphViz 2.38? I don't think that will affect these test failures, but its a place to start.

dschult avatar Aug 14 '18 17:08 dschult

Ouch - that's harder than it seems, because graphviz appears to provide only the latest release on its site, and building from a git checkout was bothersome, maybe because of old build tools on manylinux. But I'll give it a shot.

matthew-brett avatar Aug 14 '18 18:08 matthew-brett

For NetworkX we gave up on building GraphViz from source in favor of using miniconda and its prebuilt graphviz binaries. Is that an easier route?

wget https://repo.continuum.io/miniconda/Miniconda3-4.3.21-MacOSX-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p $HOME/miniconda                                                  
export PATH="$HOME/miniconda/bin:$PATH"                                                  
hash -r
conda config --set always_yes yes --set changeps1 no                                     
conda update -q conda
# Useful for debugging any issues with conda                                             
conda info -a
conda install graphviz=2.38
dot -V

dschult avatar Aug 14 '18 18:08 dschult

Making Graphviz from source worked on my Mac with git from gitlab.com. Can't find any version 2.38 tag from gitlab.com though....

git clone https://gitlab.com/graphviz/graphviz.git graphviz
cd graphviz
./autogen.sh
./configure
./make    #  or   cd lib/cdt ; make ; cd ../cgraph ; make    # if you only want those two libraries

dschult avatar Aug 14 '18 18:08 dschult

Yes, it works fine on Mac. The problem is with Manylinux, where the autoconf / automake / libtool versions are old.

matthew-brett avatar Aug 14 '18 23:08 matthew-brett

Is there a good known commit, on graphviz master, that we could use instead?

matthew-brett avatar Aug 14 '18 23:08 matthew-brett

It looks like linux from scratch has a copy of 2.38: http://ftp.oregonstate.edu/.2/lfs-website/blfs/view/svn/general/graphviz.html

And this may be helpful: https://gitlab.com/graphviz/graphviz/issues/1350

And this: https://gitlab.com/graphviz/graphviz/issues/1371

It seems like commit f54ac2c9 may correspond to the 2.38 release.

jarrodmillman avatar Aug 15 '18 03:08 jarrodmillman

OK - I think that's right, that commit f54ac2c9 is 2.38. I got graphviz building from the repo, with a bit of fiddling, and the tests pass now, on macOS and Linux, as long as graphviz is installed:

https://travis-ci.org/matthew-brett/pygraphviz-wheels/builds/416335635

Now the easy stuff is done :) - the question is whether it is possible to use or even test pygraphviz without the graphviz binaries installed.

matthew-brett avatar Aug 15 '18 12:08 matthew-brett

Nice work!

Because pygraphviz is essentially a wrapper around the libraries I don't think it can be used or tested easily without some sort of binaries present. That said, if libcgraph and libcdt are present we should be able to test it by creating a graph, adding an edge/node and then reporting the edges and nodes of that graph. I'm not sure I understand which question we need to address... my understanding is that we would need to include a version of those libraries in the wheel if we want it to work without a separate binary installation. If we want the wheel to work with a separate binary installation then we have to find that separate installation and that has caused most of the installation questions in github.

dschult avatar Aug 15 '18 13:08 dschult

I am pretty sure I don't understand this very well - but - the libraries will be present, inside the wheel - of course they have to be, in order for the extension symbols to be available. But I guess that isn't what you meant?

I don't know pygraphviz at all - but to make the question more specific - is there any prospect of making some or all the examples work, just by calls into the extensions, and therefore to the embedded libraries?

https://pygraphviz.github.io/examples.html

Specifically, is it possible to make some of the examples work without any of the graphviz application binaries (like dot) on the system PATH?

matthew-brett avatar Aug 15 '18 13:08 matthew-brett

Sorry - to be specific - the auditwheel and delocate utilities find the dynamic libraries that a Python extension needs, and embed them into the wheel. So, no need to find the libraries - they will be there, but I think we currently do need to find the application binaries - like dot.

matthew-brett avatar Aug 15 '18 13:08 matthew-brett

The code itself works without the GraphViz application binaries like dot. I guess that you are asking about the examples. The two examples in the link you show above work without dot. Most of the other examples only use dot or neato or circo in the last 2-3 lines. They first create a graph, then create a dot file from that graph, then construct the layout and draw it using one of the GraphViz tools.

We could remove the drawing portions of those examples to make them testable. It does take away from the usefulness of the examples though. Perhaps we could leave that code as comments and then include one or two examples where those features are used more heavily (and not test those).

Am I finally getting your questions?

dschult avatar Aug 15 '18 13:08 dschult

You are getting my questions - thanks for bearing with me.

The reason I asked about using the application binaries, was the failures here:

https://travis-ci.org/matthew-brett/pygraphviz-wheels/jobs/414015655

They are of form:

======================================================================
ERROR: pygraphviz.tests.test_attribute_defaults.test_default_attributes
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/venv/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/venv/local/lib/python2.7/site-packages/pygraphviz/tests/test_attribute_defaults.py", line 16, in test_default_attributes
    assert_equal(A.string().expandtabs(2),
  File "/venv/local/lib/python2.7/site-packages/pygraphviz/agraph.py", line 1278, in string
    return self.string_nop()
  File "/venv/local/lib/python2.7/site-packages/pygraphviz/agraph.py", line 1258, in string_nop
    return self.draw(format='dot', prog='nop').decode(self.encoding)
  File "/venv/local/lib/python2.7/site-packages/pygraphviz/agraph.py", line 1515, in draw
    data = self._run_prog(prog, args)
  File "/venv/local/lib/python2.7/site-packages/pygraphviz/agraph.py", line 1336, in _run_prog
    runprog = r'"%s"' % self._get_prog(prog)
  File "/venv/local/lib/python2.7/site-packages/pygraphviz/agraph.py", line 1323, in _get_prog
    raise ValueError("Program %s not found in path." % prog)
ValueError: Program nop not found in path.

Tracking through that module, the _run_prog method is used to call

  • unflatten, tred, acyclic in the unflatten, tred, acyclic methods, respectively;
  • any of neato, dot, twopi, circo, fdp, nop in the layout and draw methods.

Is there any way of replacing these with library calls, do you think?

matthew-brett avatar Aug 15 '18 15:08 matthew-brett

Ahhh.... I see what you are saying.

Yes, there are some functions which use _run_prog with pipes to run these GraphViz programs using the dot files produced by pygraphviz. These are convenience functions in the sense that someone could "easily" do the same thing at the command line, but this sets up the pipes and does it all within the Python environment.

The answer (as I understand it) unfortunately is that it is essentially impossible to replace those calls with library calls. GraphViz doesn't provide that kind of interaction. There is no API for direct library calls with GraphViz.

So... it sounds like we have tools that will extract which libraries are used and include them with the wheel, but we don't have tools to automatically extract which command line tools we need available so they could be included in the wheel.

This helps me realize that while these are convenience functions, their existence means we probably should include the command line tools in the wheel too.... and that leads me to question whether we should be doing this as we'll be essentially including all of GraphViz in the wheel... rats

dschult avatar Aug 15 '18 15:08 dschult

I'm sorry to be slow - but just to take an example. Let's say I want to unflatten my graph. I think you are saying there is no way to take the in-memory graph contained in an AGraph instance, and unflatten it, without calling into the unflatten binary? I guess the unflatten binary does call into the library? Or is that not true ? Maybe the application binaries have a lot of code that is not in the library?

matthew-brett avatar Aug 15 '18 15:08 matthew-brett

Yes, my understanding is that the applications binaries are primarily not using the libraries.

dschult avatar Aug 15 '18 15:08 dschult

A quick scan:

  • nop doesn't seem to have anything big in it;
  • unflatten, acyclic, do have some code, but it looks like this could fairly easily be moved out into separate files;
  • tred has a little more code.

I couldn't immediately see where neato, dot etc were coming from.

I guess for now we could build the wheel, then say:

In order for this package to be fully functional, you should:

  • Linux: apt-get / yum install graphviz
  • macOS brew install graphviz
  • Windows: install via https://graphviz.gitlab.io/_pages/Download/Download_windows.html (and we make some effort to find where the binaries went).

In the longer term, maybe we could ask the graphviz developers to refactor the application stuff out into libraries.

matthew-brett avatar Aug 15 '18 16:08 matthew-brett

Here is a pdf (https://www.graphviz.org/pdf/libguide.pdf) that describes something similar to what you want to do: use GraphViz as a library. I did not know about this capability. I'm not sure how easy it would be to rewrite the relevant parts of pygraphviz and I certainly don't have time to do that right now. Perhaps we can do as you suggest for a wheel with instructions for how to install graphviz. And then start the process of rewriting it to use the libraries.

But maybe I should dive into this a little more to see how much work is actually involved.

dschult avatar Aug 15 '18 18:08 dschult

From the PDF, it looks as though at least the layout commands can be done from the library (section 3) and maybe the drawing (2.3). That's just at a quick read though. Maybe worth contacting the maintainers with the question?

matthew-brett avatar Aug 15 '18 18:08 matthew-brett

OK... I've looked into this more and I think it is possible to:

  • Use SWIG to make the graphviz calls gvRender and gvLayout accessible from python.
  • Rewrite the pygraphviz code to use those libraries instead of using the command line tools.

This would be a fairly major upgrade/change in the code base. The tests would change too. The benefit would be ability to easily include just the libraries in a wheel distribution instead of all GraphViz.

@jarrodmillman and @matthew-brett, this will not be able to be done in a timely manner. What do you think about building a wheel that doesn't include GraphViz and telling people they need to have the GraphViz tools on the os path?

dschult avatar Aug 16 '18 16:08 dschult

For now, we will release 1.5 without rewriting to use the library. Perhaps, we could rewrite before releasing 1.6. There is still some work needed to get the Windows wheels. Once that is done, I will release 1.5 (hopefully, before Monday 8/27).

jarrodmillman avatar Aug 22 '18 20:08 jarrodmillman

@matthew-brett, @dschult I am starting to work on making a new pygraphviz release. It would really like to see if we can get a binary wheel for this release.

jarrodmillman avatar Jul 14 '20 17:07 jarrodmillman

@dshult - did you make any progress in transitioning to the API? Is it a lot of work?

matthew-brett avatar Jul 15 '20 18:07 matthew-brett

@matthew-brett I have basically not touched this since summer 2018. I will try to jog my memory enough to get started again. If you are up on the latest SWIG it would be great to get gvRender and gvLayout into the interface files to make them accessible from python. That's the first steps.

dschult avatar Jul 15 '20 19:07 dschult

I have proof of concept code that adds the library libgvc to lib graph and uses gvRender and gvLayout to create a png of a graph laid out with dot. I think the process will be easy to generalize to the other layout tools and other image formats. Right now it is swig 3.x but I’ll try with 4.0 today.

I guess I will start by adding this functionality to the existing agraph _run_prog() methods. But I think in the long run this might replace them. What do we need to remove to be able to build the wheels? Is it enough to add the library so this new approach works? Or do we need to remove some of the old functionality?

dschult avatar Aug 05 '20 12:08 dschult