jep icon indicating copy to clipboard operation
jep copied to clipboard

Create DirectNDArray based on numpy ndarray

Open devcrocod opened this issue 5 years ago • 4 comments

There is ndarray in python, which we received as a result of some manipulations. At the moment, in Java, we can get a simple NDArray, but in order not to copy all the values, it is possible to use the same memory that the existing ndarray uses, using NewDirectByteBuffer from JNI.

For example: commit

devcrocod avatar Aug 01 '19 08:08 devcrocod

I would love to have this feature and your code is on the right track. I have considered this before but the problem that always stops me is how to clean up the memory. If I understand your code correctly then you don't hold any reference to the ndarray after you get the buffer, so Python will free the memory as soon as there are no references, so I expect this code example to be very dangerous:

jep.eval("a = numpy.ndarray((10,),'B')");
ByteBuffer data = jep.getBuffer("a");
// You can use the buffer happily here
jep.eval("a = None"); // Python will free the memory
// Using the buffer here is undefined.

Using the buffer after it has been freed in Python might work for some people, hopefully it will seg fault for most people and occasionally it might just access random memory. We could potentially fill the javadoc with warnings against this use case, it would help for simple use cases but I expect most users want to pass the ByteBuffer on for more complex processing or into other libraries where it might not be so easy to control the lifecycle of the ByteBuffer. Jep joins two languages that are supposed to protect you from this kind of low level memory management and I do not want to expose this type of dangerous behaviour through Jep.

I think we can do slightly better utilizing the MemoryManager that @ndjensen added in 3.8. If we create a PyPointer for the ndarray, then we can incref it to prevent Python freeing the memory and when garbage collection comes for the DirectByteBuffer we can decref to free it. That process might be slow, waiting for garbage collection and then further waiting for the memory manager to see the change, I worry it looks like a memory leak to some users, but in the end it is not an actual leak so I think this should work well, until...

What happens when someone calls Jep.close() but we still hold a reference to the ndarray. For PyObject we can just invalidate the object and force the user to stop accessing it. But there is no way to invalidate a DirectByteBuffer. If we free the ndarray then the DirectByteBuffer would be invalid. If we don't free the array but close the python interpreter then we no longer have the MemoryManager and we won't free it later, leaking memory. I think we would have to throw an exception and prevent the interpreter from closing. An exception seems most correct but I don't think it actually helps the caller, they will probably stop using the Jep which means the memory gets leaked anyway. Ideally they should close Jep again later but I doubt callers will reliably wait for garbage collection and try again. Maybe we should run System.gc() before throwing an exception from close(), I think that would make things better but some users might not be pleased if we stall the entire JVM.

All of the complexity introduced by Jep.close() can be avoided for SharedInterpreters. It should be completely valid to create a ndarray on one SharedInterpreter and then free it on a different SharedInterpreter, so it would be valid to close() a SharedInterpreter with living DirectByteBuffers as long as other SharedInterpreters are watching the MemoryManager and freeing them. But currently SharedInterpreters do not share a MemoryManager so that would need to be implemented first. Also at the end of the day we would like the feature supported on both SubInterpreters and SharedInterpreters so we would still need the code to deal with close() on non-shared Jep instances.

Sorry for the long reply, I hope it mostly makes sense. In summary, getting the shared memory is as easy as you describe but correctly and safely freeing the memory is going to be more complicated. I've tried to explain the best solutions I have to the problems, it would be great if there are simpler solutions so let me know if you see something I am overlooking. I have not had time to put any of my ideas into code to try them out. It would be great if you could continue to evolve your code to include safe freeing, if you can get all the issue worked out this is a feature I would like to include in jep.

bsteffensmeier avatar Aug 01 '19 18:08 bsteffensmeier

Here is a working example:

LongBuffer buffer = null;

try (Jep jep = new Jep(new JepConfig().addSharedModules("numpy").setRedirectOutputStreams(true))) {
    jep.eval("import numpy as np");
    jep.eval("x = np.array([1, 2, 3])");

//  This is workaround, since an error occurs on PyArray_NBYTES 
//  if you don't perform actions with jep before. I don't know why.
    jep.getValue("x.shape");

    buffer = jep.getBuffer("x").order(ByteOrder.nativeOrder()).asLongBuffer();
// Here we can use buffer
    jep.eval("x = None");
// And here we can use buffer
} catch (JepException ignore) {} // close
// And here. also we can get the values 1, 2, 3

Since Java has capture the memory, jvm will alse manage it. So when the buffer is no longer needed and gc is built (Java does not provide us with the opportunity to explicity free memory), we hope that this memory is free. At this point, need to check for leaks.

I agree with you that needed to communicate with arrays through PyPointer, since it will allow to manage the ndarray more reliably.

It turns out like this:

  1. If gc collected out buffer, then we lose everything that was there and error may occur if we call same jep.eval("x"). To do this, we can return from JNI DirectNDArray which will have a buffer field. The when the DirectNDArray object is destroyed, we can destroy x in the python.
  2. If we destroy object x in python, then we can still use our buffer. As shown above.
  3. close(). It is very difficult to say. In theory, we can destroy everything sequentially, but if we sequentially free up resources, this will lead to an error (for example, the gc worked before closing).

Since Java9 Oracle has marked finalize Deprecated. Instead, they suggested using phantom links. Do you know something about this? I will think more about and see if will possible to apply them in this case.

devcrocod avatar Aug 02 '19 11:08 devcrocod

I have never seen any documentation that indicated the JVM will manage memory when you use NewDirectByteBuffer from JNI. Although not official documentation this stackoverflow question confirms that Java is not managing the memory and we are responsible for freeing it. And since Python/numpy allocated the memory, we need to use python api to free it.

In your example I suspect the memory was freed, but there is nothing that is going to actually remove the data so you can still access it. This is undefined behaviour and we cannot safely rely on it. To verify my understanding I pulled your code and ran the following test. On my system I have noticed in the past that after an object is freed if you create a similar object it will usually recycle the same memory. This is another undefined behaviour so you may see different results running this example on a different system. I would never rely on this behaviour in production but for a test it demonstrates that python is freely recycling the memory from the array and we cannot trust the contents of the Direct Buffer after python has freed the objects.

try (Jep jep = new Jep(new JepConfig())){
    jep.eval("import numpy as np");
    jep.eval("x = np.array([1, 2, 3], dtype='q')");
    jep.getValue("x.shape");
    LongBuffer buffer = jep.getBuffer("x").order(ByteOrder.nativeOrder()).asLongBuffer();
    System.out.println(buffer.get(0)); // Output: 1
    jep.eval("x = None");
    System.out.println(buffer.get(0)); // Output: 1
    jep.eval("x = np.array([1, 2, 3], dtype='d')");
    System.out.println(buffer.get(0)); // Output: 4607182418800017408
    System.out.println(Double.longBitsToDouble(buffer.get(0))); // Output: 1.0
    jep.eval("x = None");
    jep.eval("x = np.array([1, 2, 3], dtype=object)");
    System.out.println(buffer.get(0)); // Output: 139671068077536
    jep.eval("print(id(1))"); // Output: 139671068077536
    buffer.put(0,0xDEADBEEF);
    jep.eval("print(x[0])"); // SIGSEGV
}

bsteffensmeier avatar Aug 02 '19 16:08 bsteffensmeier

This ticket is in limbo until someone can solve the memory freeing issues described by @bsteffensmeier.

ndjensen avatar Oct 13 '20 20:10 ndjensen