jep
Distribute jep as a wheel
I'd like to use jep in my application, but I'd need a compiled version to include in the jar alongside the python interpreter. I considered trying to download and extract the wheel at runtime, until I realized you didn't have one. Is there any obstacle other than changing from distutils to setuptools that would prevent you from uploading jep as a wheel? If there are problems, could you please tell me how I could help fix them? I already understand that it'd be hard to compile it both with and without numpy support, but I think you should get basic support for wheels out, even if they don't support numpy. I've already tried to turn jep into a wheel on my system, and it seems to work just fine, though I could be missing something.
It's certainly something we should solve; it's just not a priority. It was no small feat to get the build working as well as it does today, and getting wheels working is more complex. I also got a wheel working, but it was platform-specific and I didn't want to put one up that only fit that platform. Also, for the time being I was trying to avoid a dependency on setuptools.
Where do we draw the line on scope?
https://pypi.python.org/pypi/numpy/1.12.0rc2 At the time of writing of this comment, numpy has 22 wheels for different platforms and versions of python. I would not mind dropping support for Python 2.6 and 3.2, and dropping support for 32-bit. Not sure how others feel about that. That still leaves a lot of wheels though.
Also how many versions of numpy do we support? Or just the latest at time of a Jep release? Or maybe all the wheels are without numpy? I think a lot of Jep users are using the numpy features.
How do you envision this working? We also have the requests for Maven support, which I was delaying until Jep 4, as the package name would most likely change to org.____.jep. Would someone get the wheel and the jar is pre-compiled with the wheel? If so, where does the jar install to? Would someone get the jar from Maven and the jar contains the wheels and somehow selects the right wheel to extract and install? That would be cool but would increase the jar size and it's non-trivial to make that happen. Or would it be two separate steps, someone would use Maven to get the jar and pip to get the wheel?
Above all, the wheels would need to be streamlined. Since @bsteffensmeier and I are the only active contributors at present, I'm not in a rush to add more complications to every release. It's already slow for me to test on different platforms at release time, and I don't test every possible variation, just a set of virtual machines; we need to get more of that automated. If wheel support is added, it's that much more release work unless we have a volunteer or can get it fairly automated.
I'm willing to get wheels working for Python 3.5 and 2.7 on Windows and Linux if you'd accept a PR.
As for which wheels you're going to distribute, I feel it's only necessary to do it for the common and publicly supported versions of Python (3.3, 3.5 and 2.7), without numpy support. Although you should retain support for compiling other versions by hand, I feel we should get basic support out for non-numpy, common Python versions. As for automation, I do know that travis-ci supports automated building of wheels and deployment to pypi, and it should be relatively easy to set up after we add support to the build system.
Personally, maven support was my primary intention for the wheel system, since I need to integrate jep into my build setup. My plan was to build a sort of 'embedded jep', that includes the python 3.5 interpreter inside the jar. However, that wouldn't work well for all use cases, since it'd produce a very large jar, and including a python interpreter is somewhat outside the scope of the project.
Instead, you could produce multiple maven artifacts, one for each version of python and platform, alongside a primary module, and an embedded support module.
The primary module would contain the core logic the jar contains now, and the user would be expected to have already loaded jep onto the classpath in some way. This gives the user complete control over installing jep, and supports any use case they have while allowing them to integrate jep's classes into their IDE and build system. The embedded support module would add logic to locate and load the python interpreter, and to locate and load the jep wheel from either the classpath, python's package directory, or a user-specified location. This supports the vast majority of use cases, while still giving the user some control over how they include jep. You'd also include platform/interpreter-specific modules that are simply thin wrappers around their respective wheels, so jep can find them on the classpath if the user desires a completely embedded jar.
All platform-specific modules could be bundled together into a single jep-embedded-all artifact (like netty has). From the user's point of view, they're just including a single maven artifact that does the hard work for them at runtime.
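The lookup that the embedded support module would perform can be sketched roughly as follows. This is only an illustration in Python (the real logic would presumably live in Java inside the jar). The function name `select_wheel`, the example file names, and the placeholder version `0.0.0` are all hypothetical, and the sketch assumes the bundled wheels use PEP 427 file names without a build tag.

```python
def select_wheel(candidates, python_tag, platform_tag):
    """Return the first wheel file name matching the interpreter and platform.

    Assumes PEP 427 names without a build tag:
    name-version-pythontag-abitag-platformtag.whl
    """
    for filename in candidates:
        if not filename.endswith(".whl"):
            continue
        parts = filename[:-len(".whl")].split("-")
        if len(parts) != 5:
            continue  # skip names with a build tag or malformed names
        _name, _version, py, _abi, plat = parts
        if py == python_tag and plat == platform_tag:
            return filename
    return None  # nothing bundled for this interpreter/platform


# Hypothetical wheels bundled on the classpath (placeholder version number)
bundled = [
    "jep-0.0.0-cp27-cp27m-win_amd64.whl",
    "jep-0.0.0-cp35-cp35m-win_amd64.whl",
    "jep-0.0.0-cp35-cp35m-manylinux1_x86_64.whl",
]
chosen = select_wheel(bundled, "cp35", "manylinux1_x86_64")
print(chosen)
```

Returning `None` when nothing matches leaves room for the fallbacks mentioned above (python's package directory or a user-specified location).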
For my use case, I'd simply make my own artifact, that includes the python 3.5 interpreter for each platform, and depends on the respective platform specific modules, without having to include a large monolithic jar that contains all possible wheels.
However, this is a ton of work, and is far more than I need for my use case. Regardless, wheel support is a prerequisite to any of it, and I'm definitely willing to get that done if you'll accept it.
In short, you think that we should release a wheel for Python 2.7 and latest Python (currently 3.5), without numpy support, 64-bit only, latest Jep only, for each of the three major operating systems. (I could probably cover OS X). So 6 wheels total?
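To make the count concrete, the proposed matrix can be enumerated with a few lines of Python. The platform labels here are informal names chosen for illustration, not real wheel platform tags.

```python
from itertools import product

python_versions = ["2.7", "3.5"]
platforms = ["windows", "linux", "osx"]  # informal labels, not real platform tags

# One wheel per (Python version, operating system) pair: 64-bit only, no numpy
wheel_matrix = [
    {"python": py, "platform": plat, "bits": 64, "numpy": False}
    for py, plat in product(python_versions, platforms)
]

for entry in wheel_matrix:
    print("Python {python} / {platform} / {bits}-bit / numpy={numpy}".format(**entry))

print(len(wheel_matrix), "wheels per Jep release")
```

Adding a third Python version or a numpy variant multiplies the list accordingly, which is the release-work concern raised above.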
And you're offering to make a pull request for the change necessary to the setup script to produce wheels, and to also upload the wheels? I'm somewhat concerned about how often we'd have to upload wheels and if we'd need to test the wheels. Python 3.6 will be out soon, or would we try to match the latest standard version of Python that ships with the current Ubuntu? The Jep release schedule is rather undefined (patch release for stability issues, all other changes in next minor release which has no set date).
I need to look into using travis-ci more. It appears it doesn't support Windows and has limited support for OS X. Also it wasn't clear to me if I could make a Python project that had Java installed with it. I don't see anything in the docs blocking that idea, but I doubt anyone has done that before.
Setting up a build system with Jenkins CI is fairly trivial; the only problem is the cost of hosting the slew of OSes, especially OS X (not too many offered out there).
You could easily use Docker containers on a single Windows host for Windows plus multiple Linux distros.
The OSX86 Project might also work but that may have legal implications.
Also, I wouldn't recommend letting anyone you can't verify upload binaries for you.
In a month or so I can set up a build server. If someone wants to submit a PR with the upgrade to setuptools, it saves me work.
Note that the delay is because I just had my first child so no sense in wasting money on a Windows Server VM until I will actually use it.
I will provide access to Jenkins, which will be configured to pull from your repo based on webhook, so you control when builds happen and what code goes in.
If anyone can provide a VM, I can get started faster. I will need Windows Server 2016 in order to do builds on Windows and Linux. For OS X, I'd suggest emailing me so we can discuss options.
I really like this project, and am even building my own OSS project featuring it. Once that work is underway I will need a build machine, so I am happy to share once I have to use the service myself.
A couple of thoughts:
I had been avoiding depending on setuptools since it didn't come with the standard library, even though most Python packages require it. I was trying to keep dependencies at a minimum, though given setuptools' widespread use throughout the Python ecosystem maybe I should reconsider. We could also possibly overcome my aversion with something like:
    try:
        from setuptools import setup
    except ImportError:
        from distutils.core import setup
Right now we're using TravisCI and Appveyor for testing purposes. Ideally we'd like to get to a point where when we release the following would happen with a single command:
- Tag version in github.
- Upload to pypi.org.
- Upload to maven central.
- If major or minor release (not patch release), upload javadoc to github io page.
Right now those steps are manual.
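The steps listed above could be driven from one entry point along these lines. This is a minimal sketch that only builds the plan. The function name `release_steps` and the version string are hypothetical, and the actual tagging and uploading commands are deliberately left out.

```python
def release_steps(version, is_patch):
    """Return the ordered list of release steps for the given version."""
    steps = [
        "tag {} on GitHub".format(version),
        "upload {} to pypi.org".format(version),
        "upload {} to Maven Central".format(version),
    ]
    # Javadoc is only republished for major or minor releases
    if not is_patch:
        steps.append("upload javadoc for {} to the GitHub Pages site".format(version))
    return steps


# Hypothetical version number, for illustration only
for step in release_steps("1.2.0", is_patch=False):
    print(step)
```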
numpy has OS X (macOS) wheels so somehow they figured out mac building, unless someone is manually creating the wheels on a personal machine. I know we have some BSD users so ideally we'd have a wheel for that too.
Other big concern is numpy inclusion or not. With absolutely no numbers to back this up, I suspect that the majority (> 50%) of our users are using Jep with the numpy support built to enable passing NDArrays back and forth between Java and Python. Reading PEPs 425, 427, and 491, I see no way to name a wheel to indicate if it was built with numpy or not, or the numpy version. So are you proposing the wheels would be for Jep compiled without numpy support and those who wanted numpy support would run pip with the no binary or no wheel options?
Or can we build a project for pypi named jep-numpy that was compiled with numpy support? If we built with numpy support what version? I'm not sure how compatible different numpy releases are if we built with one version and a different version was installed. Here are some mentions of the issues with that:
- https://github.com/numpy/numpy/issues/5888
- https://github.com/ContinuumIO/anaconda-issues/issues/6678
- "If you build against an older numpy version, it is forwards-compatible with newer numpy versions. The inverse is not true."
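Taken at face value, the rule quoted above reduces to a version comparison: the installed numpy must be at least as new as the one Jep was built against. The sketch below is a simplification with hypothetical helper names. It compares version strings only, does not query numpy itself, and does not account for ABI breaks that might come with a new major version.

```python
def parse_version(text):
    """Turn '1.11.3' into (1, 11, 3); suffixes such as 'rc2' are ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def numpy_runtime_ok(built_against, installed):
    """True if the installed numpy is the same as or newer than the build-time one."""
    return parse_version(installed) >= parse_version(built_against)


print(numpy_runtime_ok("1.11.0", "1.12.0"))  # built against older, running newer
print(numpy_runtime_ok("1.12.0", "1.11.3"))  # built against newer, running older
```

Under this rule, a wheel with numpy support would want to be built against the oldest numpy version the project is willing to support.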
Alternatively we could host wheels with different numpy versions somewhere, they just wouldn't be on pypi with numpy support. I'd argue the wheel naming conventions need to be more advanced and flexible, but I'm not going to write the PEP on that.
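For reference, the file name layout from PEP 427 is `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. The sketch below splits a name into those fields, which shows that none of them is meant to carry "built with numpy" or a numpy version. The helper name and the example file name (including its placeholder version) are hypothetical.

```python
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version build python abi platform"
)


def parse_wheel_filename(filename):
    """Split a PEP 427 wheel file name into its components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None  # the build tag is optional
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError("unexpected number of fields in: " + filename)
    return WheelName(dist, version, build, py, abi, plat)


# Hypothetical file name with a placeholder version number
info = parse_wheel_filename("jep-0.0.0-cp35-cp35m-win_amd64.whl")
print(info)
```

A separately named distribution, such as the jep-numpy idea above, changes the first field, which is the only place such a distinction fits cleanly.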
I did not intend to discourage the idea of wheels if I came across that way.
Further thoughts:
- I think we should drop numpy support for the wheels. Keep it simple. If you want numpy support, you must build from source. Along with that, we may want to consider adding code to Jep to try to detect a scenario where someone uses the class NDArray when their native library does not have numpy support, and provide a warning message pointing that out.
- Reading more on setuptools, I should get over my aversion to it, everyone else in the Python ecosystem uses it. If we provide wheels, I am fine with using setuptools. If we do not provide wheels, I would like to attempt to remain with distutils.
- Wheels for Windows would be most useful, as that seems to be where developers have the most trouble building it (surprise). That would save developers from jumping through hoops to get the right MSVC compiler. Linux wheels should be easy. We should aim for macOS wheels, asking other projects how they managed it, because macOS Jep users sometimes have issues with the build. If a BSD wheel is not difficult, we should add it for completeness.
I can grant pypi access if a volunteer is willing to join the project and build and manage the wheels.
Nate,
I agree with all sentiments here. Also, having a bundled binary would allow me to use jep without numpy support in something like gradle (as a build requirement) to build a custom Jep package.
The above goes against using the "right tool for the job"; rather, it enables serious Python devs to use Python to break into the Java world.
In any case, it's trivial to set up builds for any supported OS; it's just that the options available at no charge are limited.
Setuptools will also help a little with building on Windows, assuming the correct compiler is installed for the Python version, as it has better detection of compilers; i.e., it should work without manually loading vcvars.bat (if I am not mistaken).
I am planning on setting up a Mac server soon with ESXi (allowing me to issue small build machines ad hoc with Jenkins). We can discuss, whenever that happens, whether I have the capacity to donate any resources to you; that would cover Windows, Linux, BSD, OS X, etc. (any x86 OS, anyway).
Since I will be managing the system you would only need access to Jenkins to setup the build hooks etc.
Further to that, you could optionally set up multi-JDK builds if you needed to, i.e. build with OpenJDK 8 for OpenJDK 8, etc.
Actually- if the wheels do happen I would be in a far better position with what I want to do with JEP! So, yeah wheels has my vote, I need Py2.7 and Py3.7 personally; no need for numpy support for my wheels use case.
Sorry for yammering..
Regards,
Michael
On Sep 17, 2018, at 12:02 PM, Nate Jensen wrote:
I did not intend to discourage the idea of wheels if I came across that way.
Oh, I need about a month for that server; then I can set up automatic wheel uploads.
Based on what I've seen in the GitHub history, I am not worried about resources.
I'd be happy to set up wheels! I have a Windows Server license that allows for virtualization and is recent enough to install all the required build tools.
All I'd ask for in return is help on Jep from the community, which I know I have as is, so yeah.
Let me know. It's an easy way for me to contribute to the project, as well as maybe get your ear so I can finish setting up the ability to make a fat jar that is compatible with Jenkins.
I've worked out a few different scenarios where it may work, but that's a new topic I'll send shortly.
Regards,
Michael Ruggiero