jpype
Layout of installed JPype
While debugging a weird pip resolution issue I needed to dig into the built wheel of JPype, and I spotted a few things that seem suboptimal and that I would like to discuss.
- The existence of a _jpype top-level package
This one has irked me for a little while, and I haven't yet seen a good reason for it, so I figure I should ask the question: "why does JPype need two top-level names, _jpype and jpype?". Is this a legacy thing that was a result of it being easier to create an extension at the top level in the olden days, or is there still a good reason for it? Even if it were renamed jpype._jpype, it would at least avoid us adding noise to the site-packages directory unnecessarily.
- The existence of org.jpype.jar in the site-packages directory
The site-packages directory should ideally be full of top-level packages and (for better or worse) their associated dist-info metadata (uncitable I believe, so entirely subjective). Is there a good reason that the org.jpype.jar lives at the top level and not inside the jpype top-level package (is it the same reason as 1 perhaps?)?
- The existence of jpype._pyinstaller without any context
I assumed this was an artifact of a non-clean build of the wheels when I found the jpype._pyinstaller directory, since it relies on PEP 420 by not providing an __init__.py and contains files that look like work still in progress (e.g. example.py has runtime print statements, the tests are part of the codebase unlike the rest of the JPype tests, and for the uninitiated hook-jpype.py is unimportable and appears out of place in a package [though now I know better!]).
I don't propose changing the structure of jpype._pyinstaller (unless there is a desire to move the tests/examples to the JPype tests and examples directories), but perhaps an __init__.py with a docstring to explain some of the context would be a healthy thing from a maintenance perspective?
I believe that the _jpype is legacy, though it does serve a debugging purpose. There are many times that I have had to test an internal without wanting to include the actual source.
With regard to org.jpype.jar, this is really a question of which directory we should add to the class path. We need a package directory in site packages where we can pull jars from, not just for our jar but for any jar file that has been installed in the site. The idea is that we should include the PYTHON_PATH as part of the CLASS_PATH. The most frequent reason (beyond the lack of a forward) is that people want to add the one jar that their jar needs to the path. Package one wants to start jpype to include the jar file it installed, package two wants to start jpype to add the one jar it installed, and now there is no way to use package one with package two. This one is incomplete as we currently don't add the PYTHON_PATH to the class path, but that is the intent.
The jpype._pyinstaller was a contribution. I don't know much about it, but if it requires cleanup then a PR with improvements would be great.
> I believe that the _jpype is legacy, though it does serve a debugging purpose. There are many times that I have had to test an internal without wanting to include the actual source.
Could you elaborate? Are you saying that you don't want jpype/__init__.py to run (and import all of the other stuff) in order to test something in _jpype? I can imagine if there is a problem with _jpype then there will be a huge problem with running jpype.__init__. Is that what you mean?
> The site-packages directory should ideally be full of top-level packages and (for better or worse) their associated dist-info metadata (uncitable I believe, so entirely subjective).
I found it documented at https://www.python.org/dev/peps/pep-0423/#multiple-packages-modules-should-be-rare. Since the _jpype name is private, you could argue that we don't break that rule.
> We need a package directory in site packages where we can pull jars from, not just for our jar but for any jar file that has been installed in the site.
> This one is incomplete as we currently don't add the PYTHON_PATH to the class path, but that is the intent.
Good to hear the rationale. I would probably argue against putting jars in the site-packages directory. I think I've mentioned this before somewhere, but instead we should allow packages to register a directory (either via entry points or custom metadata that we can read with importlib.metadata) where their jars exist. This way we can manage the class path automatically, since we can know which jars are important to an environment without needing to actually import all packages in the environment to find out.
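To make the entry point idea concrete, here is a minimal sketch of how discovery could look. It is only an illustration under stated assumptions: the group name "jpype.jars" is hypothetical (JPype does not define it), and so is the convention that each entry point value names a top-level package whose directory contains the jars. The helper names registered_jar_dirs and build_classpath are made up for this example.

```python
import os
from importlib.metadata import entry_points
from importlib.util import find_spec
from pathlib import Path

# Hypothetical entry point group; JPype does not define this today.
JAR_GROUP = "jpype.jars"


def registered_jar_dirs(group=JAR_GROUP):
    """Return the directories of top-level packages registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):      # Python 3.10 and later
        selected = eps.select(group=group)
    else:                           # Python 3.8 / 3.9 return a plain dict
        selected = eps.get(group, [])
    dirs = []
    for ep in selected:
        # Assumed convention: the value names a top-level package,
        # e.g. "package_one" (anything after a ":" is ignored).
        package = ep.value.partition(":")[0].strip()
        # For a top-level name, find_spec locates the package
        # without executing its __init__.py.
        spec = find_spec(package)
        if spec is not None and spec.submodule_search_locations:
            dirs.extend(Path(p) for p in spec.submodule_search_locations)
    return dirs


def build_classpath(dirs):
    """Join every *.jar found directly inside `dirs` into one class path string."""
    jars = []
    for d in dirs:
        jars.extend(sorted(str(p) for p in Path(d).glob("*.jar")))
    return os.pathsep.join(jars)
```

With something like this, build_classpath(registered_jar_dirs()) would give a class path for the whole environment from metadata alone. A dedicated jars subdirectory or a custom metadata file would work just as well; the point is only that nothing needs to be imported.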
In terms of location, is there not a standard location that one would put Jars in an environment prefix? Do they go in somewhere like $PREFIX/lib instead of $PREFIX/lib/python?.*/site-packages/ normally?
> The jpype._pyinstaller was a contribution. I don't know much about it, but if it requires cleanup then a PR with improvements would be great.
:+1:
Oddly, I do a lot of development work with _jpype without hooks installed at all. Yes, that means jpype.__init__ is not run. That is my usual configuration when I am developing a new hook where the bootstrapping order needs to be resolved. Using the full system, where something in the process depends on something else in the process, often leads to very difficult traces. If instead I insert stubs for each of the required hooks, I can work more effectively. After all, if one thing goes wrong when the first class is encountered, it causes 20 things to fail, and isolating it can be very challenging. Of course, that is not necessarily a reason for it to be installed in that location for everyone. I can certainly make a modified version when I need it.
As for the Java site packages, I am certainly open to defining how Java packages should work in the site-packages. Java doesn't have a good central location for jars, and the jars are supposed to be there to support Python packages. If a Python package installs a jar file to support the module, where should the jar file go such that we don't need to define the classpath for each module that is installed? That is one of the things leading to the current problems with requiring startJVM. Unfortunately, that modification is outside my scope, as all of the Java modules that I develop are proprietary and don't actually require interoperability. If users of JPype want to tackle that issue, so that we have some standard by which more than one JPype-using module can be used from a third piece of code, it would be much appreciated. The task is pretty simple. Set up two fake Python modules that each need a jar file to support their internals. Then make a module that depends on the two and resolve the startup issues so that neither calls startJVM and both still find their jars properly. Placing the jars in the top level of the site packages is one solution, but perhaps there is another.
My current workload is high again, so I am not getting much time to develop JPype, though I do plan to cycle back to finish the forward prototype and deal with that segmentation fault on arrays. I also want to deal with the OSX window issue, but I am still waiting on approval for use of an OSX development machine from my workplace.
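As a starting point for that experiment, one possible shape is a shared registry that both fake modules add their jars to, with the JVM started exactly once by whoever needs it first. This is only a sketch: JarRegistry is a hypothetical helper, not part of JPype, and the only real JPype call assumed is jpype.startJVM with its classpath keyword.

```python
class JarRegistry:
    """Hypothetical helper: collects jar paths and starts the JVM once."""

    def __init__(self):
        self._jars = []
        self._started = False

    def register(self, jar_path):
        """Called by each package at import time instead of startJVM."""
        if self._started:
            raise RuntimeError("JVM already started; register jars earlier")
        if jar_path not in self._jars:
            self._jars.append(jar_path)

    def start(self, launcher=None):
        """Start the JVM with every registered jar; later calls are no-ops."""
        if self._started:
            return list(self._jars)
        if launcher is None:
            # Deferred import so the registry itself does not need JPype.
            import jpype
            launcher = jpype.startJVM
        launcher(classpath=list(self._jars))
        self._started = True
        return list(self._jars)
```

Here package one and package two would each call register with the path of their own jar, and the module that depends on both would call start once. The launcher parameter exists only so the startup order can be exercised with a stub before a real JVM is involved.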