Allow ROOT to work with PyPy
Is your feature request related to a problem? Please describe.
I'd like to use fast Python with ROOT bindings. Why do I want that? I like ROOT trees, they are rather good containers, and can be very interactive with TreeViewer. I almost never use numpy, scipy or pandas (though it is said that recent PyPy versions support them). I use C++ for hard calculations (like 3-dimensional fits), but mostly I use pure Python for high-level data analysis, and it would be useful to combine PyPy and ROOT for middle-hard calculations. I'm also developing an architectural framework for data analysis in Python, and I'm curious whether PyPy + ROOT will ever be possible (it would be a good combination for speeding up Python).
I'm interested in using PyPy with ROOT. When I tried it, I got an error:
$ pypy
>>>> import ROOT
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/root/cur/lib/ROOT/__init__.py", line 22, in <module>
    import cppyy
  File "/opt/root/cur/lib/cppyy/__init__.py", line 64, in <module>
    libcppyy_mod_name, major, minor))
ImportError: Failed to import libcppyy2_7. Please check that ROOT has been built for Python 2.7
However, I enabled Python 2 support when building ROOT, and python2 imports ROOT fine.
I got a similar error with Python 3, but the Python 3 versions of my PyPy and ROOT installations were different, so I checked with Python 2 instead.
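To see which interpreter and version are actually in play when the import fails, a small diagnostic along these lines may help. It is only a sketch: the helper name `interpreter_summary` is made up for illustration, and the `libcppyy<major>_<minor>` pattern is reconstructed from the error message above, not taken from ROOT's source.

```python
import platform
import sys


def interpreter_summary():
    """Describe the running interpreter (hypothetical helper, not part of ROOT)."""
    major, minor = sys.version_info[:2]
    return {
        # "CPython" for the reference interpreter, "PyPy" for PyPy
        "implementation": platform.python_implementation(),
        "version": "{}.{}".format(major, minor),
        # Assumption: name pattern guessed from the ImportError text only
        "expected_extension": "libcppyy{}_{}".format(major, minor),
    }


if __name__ == "__main__":
    for key, value in sorted(interpreter_summary().items()):
        print("{}: {}".format(key, value))
```

Running it under both `python2` and `pypy` shows whether the two report the same language version while differing in implementation, which is the situation described above.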
Describe the solution you'd like
Allow PyPy to use ROOT.
Describe alternatives you've considered
I think that PyPy is the most general fast Python implementation (I haven't used the other ones and am not sure whether they would work with ROOT).
I've heard about numba, but it looks like it's mostly useful with numpy.
Jython didn't run when I tried it with tox.
Additional context
I've sent a letter to PyPy developers, and this is what they replied:
ROOT uses a fork of cppyy with several local modifications (and it's also very much behind cppyy master). One of them is precisely this:
> ImportError: Failed to import libcppyy2_7. Please check that ROOT has been built for Python 2.7
Which is referring to their "multi-python" build, a local "invention." But also, libcppyy is not used in PyPy and should never be loaded directly for portable use. (If they had stayed with standard Python platform tags, they would not have to load it explicitly.)
> Could you please add support of ROOT in PyPy?
This really is on the ROOT folk to make their fork compatible with the Python ecosphere, so file a bug report with them.
There was a discussion on ROOT forum in 2019 (https://root-forum.cern.ch/t/how-to-start-pypy-with-root-module), but it didn't give any direct solution to the issue, and maybe something has changed since then.
I believe PyPy has its own cppyy builtin module and PyROOT works on top of its own fork of cppyy, for which we haven't tested (or so far aimed for) compatibility with PyPy.
Regarding the error you see, please check if the ROOT installation you are using has been built for that Python version (2.7). It could also be some environment/installation issue. How did you install ROOT?
On the other hand, to speed up analysis, the recommended interface in ROOT (which you can also use from Python) is RDataFrame:
https://root.cern/doc/master/classROOT_1_1RDataFrame.html
Even when used from Python, pretty much everything happens in C++ (in particular, the event loop is C++). It also has implicit parallelisation on a multicore machine.
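As a rough illustration of that point, the sketch below builds a tiny RDataFrame from Python while the column definition and the cut are C++ expressions that ROOT compiles and runs in its own event loop. The function name `count_selected` and the column name `x` are invented for this example; `rdfentry_` is RDataFrame's built-in entry-number column. It assumes a Python interpreter for which PyROOT was built.

```python
def count_selected(n_entries=1000, threshold=500):
    """Count entries passing a cut; the loop over entries runs in C++."""
    # Imported lazily so this file can still be loaded by an interpreter
    # without a working PyROOT installation.
    import ROOT

    df = ROOT.RDataFrame(n_entries)            # empty source with n_entries rows
    df = df.Define("x", "(double)rdfentry_")   # C++ expression, JIT-compiled by ROOT
    selected = df.Filter("x >= {}".format(threshold))
    return selected.Count().GetValue()         # GetValue() triggers the event loop


if __name__ == "__main__":
    print(count_selected())
```

Only the graph construction happens in Python here; the per-entry work does not go through the Python interpreter, so its speed does not depend on which Python implementation is used.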
If there is some Python code that you'd like to use in conjunction with RDataFrame, there is the Numba.Declare feature:
https://root.cern.ch/doc/master/pyroot004__NumbaDeclare_8py.html
which will try to JIT with Numba the Python function you decorate.
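A minimal sketch of that workflow, modelled on the linked tutorial, could look like the following. The names `squares_via_numba_declare`, `pysquare`, `x` and `x2` are made up for illustration; it assumes both PyROOT and numba are installed, and the exact decorator signature may differ between ROOT versions.

```python
def squares_via_numba_declare(n_entries=4):
    """Call a Numba-compiled Python function from an RDataFrame expression."""
    # Lazy import: requires PyROOT (and numba, which Numba.Declare uses).
    import ROOT

    # Declare the argument types and return type; after decoration the
    # function is callable from C++ as Numba::pysquare.
    @ROOT.Numba.Declare(["float"], "float")
    def pysquare(x):
        return x * x

    df = (ROOT.RDataFrame(n_entries)
          .Define("x", "(float)rdfentry_")
          .Define("x2", "Numba::pysquare(x)"))
    # AsNumpy runs the event loop and returns a dict of NumPy arrays.
    return [float(v) for v in df.AsNumpy(["x2"])["x2"]]


if __name__ == "__main__":
    print(squares_via_numba_declare())
```

Because the decorator registers `pysquare` with the C++ interpreter, this sketch is meant to be called once per session; declaring the same name again may fail.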
Thanks for the suggestions! I built ROOT from source for two Python versions. Python 2.7 works fine, it imports ROOT without errors.
Thanks again for the links. Indeed, Numba allows a pretty neat syntax to translate Python functions to C++.
However, if there are many Python functions, it will require more work to decorate them all. Running PyPy would be easier in that case.
Unless there are strong reasons why PyPy can't be supported, I'd like to leave this open as a feature request.
@guitargeek is this something which is more adequate for the cppyy tracker rather than ROOT's? If yes, can we close the item?
Hi @ynikitenko, thanks for the request! There are several reasons for not supporting PyPy.
PyROOT, which builds on top of cppyy, is very deeply connected to CPyCppyy, the CPython implementation of cppyy. Just like the CPython version of cppyy, PyROOT consists of a Python module and a compiled CPython extension (libROOTPythonization). And you can't use CPython extensions in PyPy. So to support PyPy, we would have to rewrite all this code and work closely with the cppyy team to make sure its PyPy implementation also provides the interfaces that PyROOT expects. That would be a huge amount of work with no use case to justify it.
And there are so many alternative fast analysis frameworks that people are trying out right now: RDataFrame, NumPy, numba, awkward arrays, etc.
And let's not forget that PyPy has its limitations! It only supports a restricted subset of Python, and it would require a large effort to port PyROOT to this.
One final point: the future of PyPy development is not very clear. Right now, they don't support Python 3.11 yet, for example. Builds are only available for Python 3.9 and Python 3.10. So investing in PyPy compatibility is also risky because of that. Imagine we would spend half a year trying to support it, and then PyPy would not be maintained anymore :(
I'll therefore close this issue as "not planned".
tl;dr: the HEP community is gravitating towards other C++-native or JIT-based Python packages to speed up analysis, and at this point investing in PyPy is not worth it.
> And you can't use CPython extensions in PyPy.
Yes, you can. Pythonizations and cppyy are a separate issue: I haven't kept the PyPy/cppyy implementation up to date due to lack of interest, and using CPyCppyy as an extension module in PyPy won't give you the same level of speedup as the PyPy-native cppyy module does.
> It only supports a restricted subset of Python
No, PyPy supports the full Python language, but yes, as you've mentioned, it is always one or more versions behind. This is because there really isn't a "Python language". There's only "this coding runtime that CPython supports." Mostly, this is not an issue, of course.
As for "restricted subset", you're thinking about RPython in which the PyPy framework itself is written, which is irrelevant. As an analogy, CPython is written in C, but that doesn't mean that it only supports C: it supports Python in full.
Thanks a lot Wim for the clarifications!