phylanx
segfault in kmeans.phylanx.py
I'm getting a segmentation fault on line 36 of kmeans.phylanx.py. Any suggestions on how to provide better debugging information?
@parsa would you mind having a look?
It just works for me. @ct-clmsn do you have any idea how to reproduce it?
@parsa I'll give it another shot this evening.
@parsa, OK, I've started to narrow down the issue. When empty debugging print statements are placed after every line in closest_centroid, the algorithm runs to completion without a segmentation fault.
When those interleaved print statements are removed, the script segfaults. Maybe this is an edge case of the GIL/stdout issue?
After the Phylanx call, the script passes the returned results directly to a Python-side print statement (line 97); could that be a GIL/stdout edge case?
@ct-clmsn I'd really like to be able to see this error. What do I do to reproduce it?
I don't know what you mean by the edge case issue for line 97. That kmeans function call returns a centroids×2 NumPy array, which is then printed.
@parsa, to reproduce the error, I just run the script, and it segfaults (when run without the print statements). If it works for you, I'm not sure how to reproduce the problem other than to suggest running it on a couple of different machines (I suspect the testing system already has a few). If you have advice on generating a stack trace or other debugging output, please pass it along, and I can try to produce some output that might help us sort this out.
For line 97, the edge case might be that the result (the NumPy array returned from centroids) is returned from the Phylanx-annotated function straight into a Python print statement. Maybe there's an issue getting the data from the Phylanx backend into the Python front-end for this particular use case?
The fact that adding print statements "fixed" the issue on my systems makes me suspect that a GIL/stdout bug occurs during the return from that call into the Python front-end's print statement.
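Since the crash shows up on some machines and not others, a small harness that reruns the script and counts segfaults may help confirm whether it is intermittent. The following is only a sketch: count_segfaults is a hypothetical helper (not part of Phylanx), and it assumes a POSIX system, where a child killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys


def count_segfaults(cmd, runs=10):
    """Run `cmd` `runs` times and count the runs that were killed by SIGSEGV.

    Hypothetical helper for illustration; assumes a POSIX system.
    """
    crashes = 0
    for _ in range(runs):
        # stdout/stderr are deliberately left attached to the parent's
        # streams, because the crash discussed here appears to be
        # sensitive to printing.
        result = subprocess.run(cmd)
        # On POSIX, a return code of -N means the child died from signal N.
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes
```

For example, count_segfaults([sys.executable, "kmeans.phylanx.py"], runs=20) would report how many of 20 runs ended in a segmentation fault.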
@ct-clmsn, try setting ulimit -c unlimited before executing; when it crashes, you should get a core file that you can load and examine with GDB.
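As a complement to the core-file approach, the same limit can be raised from inside the script, and Python's standard faulthandler module can print the Python-level traceback when the process receives SIGSEGV. This is a minimal sketch assuming a Unix system; enable_crash_diagnostics is a hypothetical helper name, not something provided by Phylanx.

```python
import faulthandler
import resource
import sys


def enable_crash_diagnostics(trace_file=None):
    """Enable Python tracebacks on fatal signals and raise the core-file limit.

    Hypothetical helper for illustration; assumes a Unix system.
    Returns the resulting (soft, hard) core-file size limits.
    """
    # Dump the Python traceback of all threads on SIGSEGV, SIGFPE,
    # SIGABRT, SIGBUS, and SIGILL. The file must stay open while enabled.
    target = trace_file if trace_file is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)

    # In-process counterpart of `ulimit -c`: raise the soft limit as far
    # as the hard limit allows (unlimited only if the hard limit is).
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

Calling enable_crash_diagnostics() at the top of the script, before the Phylanx call, should make a crash print the Python frame that was active. Whether a core file is actually written also depends on the system's core-dump configuration.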
@khuck thanks! I'll give that a shot.
@ct-clmsn I think @parsa is now able to reproduce this issue. We're working on a fix as we speak...
@parsa @hkaiser my apologies for the lack of a reasonably clear bug report!
Does this issue still exist?