NumPy 1.14+ compatibility

Open Stewori opened this issue 6 years ago • 8 comments

... is currently broken:

Traceback (most recent call last):
  File "/data/workspace/linux/JyNI/JyNI-Demo/src/JyNINumPyTest.py", line 68, in <module>
    import numpy as np
  File "/data/workspace/linux/numpy/1.14.0/numpy/__init__.py", line 168, in <module>
    from . import ma
  File "/data/workspace/linux/numpy/1.14.0/numpy/ma/__init__.py", line 44, in <module>
    from . import core
  File "/data/workspace/linux/numpy/1.14.0/numpy/ma/core.py", line 6333, in <module>
    masked = masked_singleton = MaskedConstant()
  File "/data/workspace/linux/numpy/1.14.0/numpy/ma/core.py", line 6270, in __new__
    cls.__singleton = MaskedArray(data, mask=mask).view(cls)
  File "/data/workspace/linux/numpy/1.14.0/numpy/ma/core.py", line 2790, in __new__
    _data = ndarray.view(_data, cls)
AttributeError: 'type$1' object has no attribute 'update'

Stick to NumPy 1.13.3 until this is solved...

Stewori avatar Jan 17 '18 21:01 Stewori

I just investigated newer NumPy versions. 1.14.1 and 1.14.2 result in the same error as 1.14.0 above. NumPy 1.14.3, 1.14.4 and 1.14.5 give this new error:

('1.8.0_171', 'Oracle Corporation', ('OpenJDK 64-Bit Server VM', '25.171-b11', 'Oracle Corporation'), ('Linux', '4.4.0-128-generic', 'amd64'))
Traceback (most recent call last):
  File "/data/workspace/linux/JyNI/JyNI-Demo/src/JyNINumPyTest.py", line 73, in <module>
    import numpy as np
  File "/data/workspace/linux/numpy/1.14.5/numpy/__init__.py", line 166, in <module>
    from . import random
  File "mtrand.pyx", line 1, in init mtrand
  File "/data/workspace/linux/numpy/1.14.5/numpy/__init__.py", line 118, in <module>
    __NUMPY_SETUP__
TypeError: unhashable type: 'str'

Stewori avatar Jun 30 '18 20:06 Stewori

So, I traced the first type of failure to NumPy's source file multiarray/ctors.c. Specifically, in the function PyArray_NewFromDescr_int, the line res = PyObject_Call(func, args, NULL); fails, i.e. returns NULL, triggering the subsequent goto fail;. The next step is to identify why this call fails. For now I only know that it emits a series of Python exceptions, of which the message we see, i.e. 'type$1' object has no attribute 'update', is only the tip of the iceberg. The full sequence of error messages is:

'numpy.ndarray' object has no attribute '_fill_value'
'numpy.ndarray' object has no attribute '_hardmask'
'numpy.ndarray' object has no attribute '_sharedmask'
'numpy.ndarray' object has no attribute '_isfield'
'numpy.ndarray' object has no attribute '_baseclass'
'type$1' object has no attribute 'update'

So far I conclude that somehow attribute (write-)access of numpy.ndarray got messed up.

Stewori avatar Jul 01 '18 13:07 Stewori

Tracked it down a bit further: It fails in the method MaskedArray._update_from in ma/core.py. Specifically the line self.__dict__.update(_dict) fails. Here _dict contains a bunch of updates for attributes that shall be written to self.__dict__. It seems that for some reason self.__dict__ might not be an ordinary dictionary, which somehow causes the issue. type(self.__dict__) results in <type 'type$1'>, which is familiar from our error message 'type$1' object has no attribute 'update'. print(self.__dict__) outputs <attribute '__dict__' of 'MaskedArray' objects>. So maybe it's an issue with attribute wrappers while the dictionary itself is fine. It seems like the key lies in understanding what type$1 actually is.

The MaskedArray class extends ndarray, and I remember that I once observed Astropy failing with JyNI because of ndarray subclassing. Maybe before version 1.14.0 the NumPy import worked without subclassing ndarray. Unfortunately the commit history of ma/core.py is very complex, and it is challenging to track down the breaking commit. I thought about using git's bisect feature, but that's probably tedious because of all the rebuilds required. For now I will just continue the investigation. If I can fix ndarray subclassing in JyNI, then it will hopefully also improve use cases like Astropy.

Stewori avatar Jul 02 '18 14:07 Stewori

Note that the CPython output is as one would naively expect: instead of <type 'type$1'> we get <type 'dict'>, and instead of <attribute '__dict__' of 'MaskedArray' objects> we get {}. So somehow JyNI, or maybe Jython itself, messes up this attribute. Maybe it just fails to evaluate the attribute wrapper properly.
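
For illustration, the difference between the two outputs can be reproduced in plain Python without NumPy or JyNI. This is only a minimal sketch: the class Masked and its method update_from are hypothetical stand-ins for MaskedArray and _update_from, not NumPy code. It shows that the raw entry stored on the type is a descriptor without an update method, and that only the result of its __get__ is the actual dictionary.

```python
class Masked(object):
    """Hypothetical stand-in for MaskedArray; not NumPy code."""

    def update_from(self, attrs):
        # Same pattern as the failing line: self.__dict__.update(_dict)
        self.__dict__.update(attrs)


obj = Masked()
obj.update_from({'_hardmask': False})

# Normal attribute access evaluates the descriptor and yields a real dict.
evaluated = obj.__dict__

# Reading the entry straight from the type skips the descriptor protocol.
# Its repr has the same shape as the output seen under JyNI:
# <attribute '__dict__' of '...' objects>
raw_descriptor = Masked.__dict__['__dict__']

# Calling __get__ by hand gives back the instance dictionary.
resolved = raw_descriptor.__get__(obj, Masked)

print(type(evaluated).__name__)
print(type(raw_descriptor).__name__)
print(hasattr(raw_descriptor, 'update'))
```

If the attribute lookup returns raw_descriptor instead of resolved, the subsequent .update call has to fail with an AttributeError of the kind shown in the traceback.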

Stewori avatar Jul 04 '18 18:07 Stewori

The __dict__ object of type <type 'type$1'> is an instance of PyDataDescriptor, see PyDataDescriptor.java. type$1 is a generated type (as I had already suspected). PyDataDescriptor is abstract and Jython generates concrete implementations at runtime, resulting in these strange type names. Now the real question is why JyNI does not evaluate __get__. Maybe the case of extending natively defined builtin-like types is not yet properly covered, an observation that is consistent with the way Astropy fails. I will review how new-style classes are handled by JyNI. Originally the described case was supposed to work, but the support appears to be incomplete. Maybe this will require something like PyCPeerDerived. We'll see...

Stewori avatar Jul 05 '18 18:07 Stewori

Quick experiment: Inserted this code snippet into PyCPeer.__findattr_ex__ right before the return statement:

if (er != null && er.implementsDescrGet() && er.isDataDescr()){
    er = er.__get__(this, objtype);
}

Now the descriptor is evaluated, but so far this doesn't fix anything because it just evaluates to None. It results now in AttributeError: 'NoneType' object has no attribute 'update' instead of AttributeError: 'type$1' object has no attribute 'update' (Is this an improvement?). It seems like the generated implementation of PyDataDescriptor does not know what to do with a PyCPeer...
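
For reference, the lookup order that the experiment above tries to approximate can be modeled in a few lines of Python. This is a simplified sketch of generic attribute lookup as CPython documents it, not JyNI or Jython code; find_attr and Sample are hypothetical names, and the model ignores __getattr__ and metaclass details.

```python
_MISSING = object()


def find_attr(obj, name):
    """Simplified model of generic attribute lookup (illustration only)."""
    objtype = type(obj)

    # 1. Look for a class-level entry along the MRO.
    type_attr = _MISSING
    for klass in objtype.__mro__:
        if name in vars(klass):
            type_attr = vars(klass)[name]
            break

    # 2. A data descriptor on the type wins over the instance dict.
    if type_attr is not _MISSING:
        attr_type = type(type_attr)
        is_data = hasattr(attr_type, '__get__') and (
            hasattr(attr_type, '__set__') or hasattr(attr_type, '__delete__'))
        if is_data:
            return attr_type.__get__(type_attr, obj, objtype)

    # 3. Otherwise the instance dict is consulted.
    inst_dict = getattr(obj, '__dict__', {})
    if name in inst_dict:
        return inst_dict[name]

    # 4. Non-data descriptors are bound; plain class attributes are returned.
    if type_attr is not _MISSING:
        attr_type = type(type_attr)
        if hasattr(attr_type, '__get__'):
            return attr_type.__get__(type_attr, obj, objtype)
        return type_attr

    raise AttributeError(name)


class Sample(object):
    z = 3

    def __init__(self):
        self.y = 2

    @property
    def x(self):
        return 1

    def m(self):
        return 'm'


s = Sample()
```

In this model, find_attr(s, '__dict__') goes through step 2, so the caller receives the instance dictionary and never the descriptor object itself. A __get__ that returns None, as observed in the experiment, would point to the descriptor implementation not handling the given instance rather than to the lookup order.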

Stewori avatar Jul 05 '18 18:07 Stewori

The recently released NumPy 1.15.0 gives the following error on import:

/usr/lib/jvm/java-8-openjdk-amd64/bin/java:
symbol lookup error:
/data/workspace/linux/numpy/1.15.0/numpy/core/multiarray.so:
undefined symbol:
PyStructSequence_InitType

I suppose this one is easier to fix: adding a PyStructSequence_InitType implementation is not a big deal. However, I suspect there will be subsequent issues.

Stewori avatar Jul 26 '18 17:07 Stewori

Did you find any solution for this yet? Even a temporary hardcoded patch would do. I need to use sequitur-g2p with Jython and sadly it requires NumPy 1.14.2 or higher.

ShaharZivan avatar Nov 27 '18 13:11 ShaharZivan