Publish to PyPI
We've had a little think about how this could be done. Have you thought any more about it? Here are a couple of ideas...
If we break the project into two components (which it already is: libhadoofus and hadoofus), this becomes much easier. The former must be installed on the system before attempting to install the Python library via pip. This is the same approach re2 takes. I don't think that's too much to ask; it should be as simple as git clone ... && cd src && make install.
Side note: Have you considered renaming src to something less generic? Perhaps libhadoofus?
Since no package has been uploaded to PyPI yet, you need pip v1.5+ to use the git+git://...#subdirectory=wrappers/py notation. Earlier versions of pip expect a setup.py file at the root of the repository, which isn't the case here, and I don't think it should be.
Installing into a virtualenv
The shell script below assumes you already have libhadoofus installed into your library path. In theory, this should work for you. However, I've only tested it on OS X 10.9.1, so the CFLAGS/CPPFLAGS overrides may not be needed on other platforms. Treat it as a proof of concept.
$ virtualenv env
$ source env/bin/activate
$ pip install --upgrade pip # Ensure we get the latest pip
$ CFLAGS="-Qunused-arguments" CPPFLAGS="-Qunused-arguments" pip install cython
$ CFLAGS='-fno-strict-aliasing' pip install -e "git+git://github.com/duedil-ltd/hadoofus.git@d7ae3454a27f2c4ba3ecc6494d14f5a9f3c73b4b#egg=hadoofus&subdirectory=wrappers/py"
I'm using our fork in the example since it has the patch https://github.com/duedil-ltd/hadoofus/commit/d7ae3454a27f2c4ba3ecc6494d14f5a9f3c73b4b applied, plus the fixes to compile on OS X Mavericks. The patch makes sure pip locates the .pyx source files correctly. Without it, the build looks for the file in the root of the cloned repository (presumably Python's working directory) and throws an error because the file doesn't exist there.
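To illustrate the general idea (this is a sketch, not a copy of the patch): build the source paths from the directory that contains setup.py instead of relying on the current working directory. The helper name resolve_sources and the file name hadoofus.pyx are assumptions made for the example.

```python
import os


def resolve_sources(setup_file, relative_sources):
    """Anchor source paths to the directory that contains setup.py.

    setup_file is normally __file__ from inside setup.py. When that value
    is an absolute path, the result does not depend on the current
    working directory of the Python process.
    """
    here = os.path.dirname(os.path.abspath(setup_file))
    return [os.path.join(here, src) for src in relative_sources]


# Hypothetical use inside wrappers/py/setup.py:
#   sources = resolve_sources(__file__, ["hadoofus.pyx"])
```

The returned list could then be passed as the sources of the extension module, so the build finds the file regardless of where pip starts the process.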
Ping @icio – Since we were discussing this earlier, any thoughts or corrections?
I'm afraid I haven't put very much thought or effort into this issue. I'm really happy you've taken a stab at it.
It sounds fine to me to require the C library (and headers) installed or available before the Python library is installed. Can we detect this from setup.py in any reasonable way? If we split the package, may I suggest "Pydoofus" for the Python wrapper name? :-) I'm not set on that name, but I would like the Python wrappers to have 'py', or 'python-' in the name somewhere.
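One possible way to do the detection, as a rough sketch: the standard library's ctypes.util.find_library can report whether a shared library is visible to the system's usual lookup. The library name "hadoofus" (for libhadoofus) and both helper names are assumptions here. Note that this only checks for the shared library, not the headers, and it may miss libraries installed under a non-standard prefix.

```python
from ctypes.util import find_library


def find_c_library(name="hadoofus"):
    """Return what the platform's library lookup reports, or None if not found."""
    return find_library(name)


def require_c_library(name="hadoofus"):
    """Abort with a readable message when the shared library is missing."""
    found = find_c_library(name)
    if found is None:
        raise SystemExit(
            "lib%s was not found; build and install the C library first" % name
        )
    return found
```

Calling require_c_library() near the top of setup.py would stop the install early with a clear message, rather than failing later at link time.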
Is there a reason to rename src to something less generic? The file layout and build system could use some work — I'm a little embarrassed by the cobbled-together Makefile system — but I can't think of a reason to rename that directory (instead of, say, removing it and include and moving the sources up a level).
I only have pip 1.4.1, and Fedora is usually fairly up-to-date. Can we add a setup.py to the root of the repository that indirects to the wrappers directory to avoid the hard dependency on 1.5? I'm afraid I'm still pretty unfamiliar with Python packaging. And just to clarify, is 1.5 only required at packaging time for PyPI? Or is it also required by end-users?
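A root-level setup.py that indirects to the wrappers directory might look roughly like the sketch below. The helper name run_nested_setup is hypothetical, and I haven't verified how pip 1.4 copes with a setup.py that changes directory like this, so treat it as a starting point only.

```python
import os
import runpy


def run_nested_setup(repo_root, subdir):
    """Run <repo_root>/<subdir>/setup.py as if it had been invoked directly.

    sys.argv is left untouched, so whatever command was given to the outer
    setup.py (build, install, ...) is seen by the nested one.
    """
    # Resolve to an absolute path before changing directory.
    target_dir = os.path.abspath(os.path.join(repo_root, subdir))
    previous_dir = os.getcwd()
    os.chdir(target_dir)
    try:
        runpy.run_path(os.path.join(target_dir, "setup.py"), run_name="__main__")
    finally:
        # Restore the caller's working directory even if the build fails.
        os.chdir(previous_dir)


# Hypothetical use from a root-level setup.py:
#   run_nested_setup(os.path.dirname(os.path.abspath(__file__)),
#                    os.path.join("wrappers", "py"))
```

Changing into the nested directory first means relative paths inside the wrapper's setup.py keep working as they do today.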
Thanks! Conrad
I've used a few Python modules that compile C extensions, and they usually don't require separate manual steps from the developer, which is what I think we should aim for here. Supporting make [install] would still be very nice -- especially if there's going to be a C lib built alongside the Python wrapper.
The Cython documentation covers distributing Cython modules, and it suggests shipping the generated C code so that Cython isn't required to compile the Python module. That should make setup a little more straightforward. Otherwise, setuptools supports a setup_requires directive, which might run before building extensions (I've never tried) and could offer a way to ensure Cython is installed prior to building.
For repository organisation: swings and roundabouts. I'd be tempted to go for something like /src/{python/hadoofus,c/{hdfs,hadoofus}}/ and then reference everything from top-level Makefile and setup.py files -- the former for installing the C extensions and libraries, the latter for installing the Python module (including compiling the hadoofus extension, if necessary). I've had a dig around GitHub search for some neat examples of projects compiling Cython modules but haven't come up with anything great.
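As a rough sketch of that fallback: prefer the .pyx source when Cython is importable, otherwise use a pre-generated .c file sitting next to it. The helper names and the assumption that the generated file shares the .pyx base name are mine, not something taken from the project.

```python
import os


def cython_available():
    """Report whether Cython can be imported in this environment."""
    try:
        import Cython  # noqa: F401
    except ImportError:
        return False
    return True


def pick_extension_source(pyx_path, have_cython):
    """Choose the .pyx when Cython is present, else the generated .c file."""
    c_path = os.path.splitext(pyx_path)[0] + ".c"
    if have_cython:
        return pyx_path
    if os.path.exists(c_path):
        return c_path
    raise RuntimeError(
        "Cython is not installed and %s is missing" % c_path
    )


# Hypothetical use inside setup.py:
#   source = pick_extension_source("hadoofus.pyx", cython_available())
```

With this shape, a release tarball that includes the generated C file installs without Cython, while a checkout from git still builds from the .pyx.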
(Updated: new setup.py merged in #14 tackles some of these ideas)
Hm. If we actually cut tarballs, it might be reasonable to embed pre-Cythonified hadoofus.c. But I'm opposed to checking in generated code. The broad shape of everything you've said sounds good.
What's the issue you've had with the dynamic library or linking?