Intermittent Segmentation faults on Ubuntu-20.04 LTS on WSL2

Open joshuanunn opened this issue 3 years ago • 8 comments

Hi, I've been using this for a while on Ubuntu-20.04 within WSL2.

I've recently noticed that the Python script aborts with "Segmentation fault" in some cases, but unfortunately it's intermittent - it only occurs during one long processing run, and the point at which it fails varies from run to run. The run involves lots of individual calls to convert short line strings (supplied as lists to convert_bng).

If I make individual calls with each of these linestrings within the same venv, these work fine - the problem only arises when the processing script runs as a whole.

All the 0.6.* versions seem to have this same behaviour. The latest prior version that I seem to be able to install is 0.4.4, which appears to work fine, but uses OSTN02.

Details:

  • System installed Python 3.8.5.
  • Additional libraries including this one installed in virtualenv using pip.
  • No other problems to date with NumPy or any other Python libraries I can think of.

I appreciate it's not much to go on, but let me know if you would like some more information!

joshuanunn avatar Oct 19 '21 17:10 joshuanunn

I won't be able to reproduce this under WSL2 as I don't have access to a windows machine, but I can try to reproduce it under Linux and / or macOS – I'll need an example dataset that causes the failure, though.

urschrei avatar Oct 20 '21 10:10 urschrei

To follow up, I reproduced the dev environment (including Python 3.8.5) on a native bare-metal Ubuntu-18.04 machine, and I still get the same segfault using ConvertBNG 0.6.25, so it doesn't appear to be related to WSL2 specifically. Note that the CPUs for both machines were i7 based.

I then switched to Python 3.6.9, which was already on the machine, and installed everything in a venv again - in contrast, this worked fine. After switching back to Python 3.8.5 and experimenting, I can see that it is specifically fast repeated calls that seem to trigger it (or make it more likely).

Here is a minimum working example that consistently fails:

    from convertbng.util import convert_bng
    lots_of_conversions = [convert_bng([-1.89983], [52.48142]) for x in range(10000)]
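Because a segfault kills the whole interpreter, it can help to run a reproduction like the one above in a child process and inspect the exit status. The following is only a sketch: `run_isolated` and `describe` are hypothetical helper names, not part of convertbng, and it assumes a POSIX system where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys

# The reproduction from this thread, written as a plain loop.
REPRO = """
from convertbng.util import convert_bng
for _ in range(10000):
    convert_bng([-1.89983], [52.48142])
"""


def run_isolated(code: str) -> int:
    """Run `code` in a fresh interpreter and return its exit status.

    -X faulthandler makes the child print a Python traceback to stderr
    if it receives a fatal signal such as SIGSEGV.
    """
    proc = subprocess.run([sys.executable, "-X", "faulthandler", "-c", code])
    return proc.returncode


def describe(returncode: int) -> str:
    """Turn a subprocess return code into a readable message (POSIX)."""
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"
```

Calling `print(describe(run_isolated(REPRO)))` in an affected environment should then report the signal rather than taking the parent script down with it.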

I had to relax the library versions to get everything installed in Python 3.6.9 - the key difference in terms of requirements for convert_bng was the numpy version (numpy==1.19.5 for Python 3.6.9 and numpy==1.21.2 for Python 3.8.5). However, a minimal install of just numpy and convert_bng in Python 3.8.5 still segfaulted with both versions of numpy.

Hence, I'm guessing there's an issue in the latest convert_bng wheels for python 3.8?

Happy to provide more info if needed - thank you.

joshuanunn avatar Oct 20 '21 10:10 joshuanunn

Well, I can trigger a segfault with that data…

urschrei avatar Oct 20 '21 10:10 urschrei

Just as a matter of interest, are you using the (slower!) ctypes functions for a reason? Switching to

    from convertbng.cutil import convert_bng

causes the test to pass on my machine.

urschrei avatar Oct 20 '21 11:10 urschrei

Perfect, your suggested change also makes everything work fine again on my side for both the test and the processing scripts!

No reason at all - in fact speed in this application isn't too important, more just the library functionality. It just hadn't occurred to me to try, thank you for the suggestion!

joshuanunn avatar Oct 20 '21 12:10 joshuanunn

No prob! I'm going to leave this open for now as there's clearly something wrong with my ctypes code.

urschrei avatar Oct 20 '21 12:10 urschrei

Great, I imagine this issue won't affect too many users anyway, as it's only likely to occur when making a lot of repeated smaller calls, rather than larger batches.
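For scripts that convert many short line strings, one way to make fewer, larger calls is to flatten everything into a single pair of lists and split the result afterwards. This is a rough sketch only: `convert_linestrings` is a hypothetical helper, it assumes the converter takes a list of longitudes and a list of latitudes and returns a pair of sequences (eastings, northings), and nothing in this thread confirms that batching avoids the crash.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]  # (lon, lat) on input, (easting, northing) on output


def convert_linestrings(
    linestrings: Sequence[Sequence[Point]],
    convert: Callable[[List[float], List[float]], Tuple[Sequence[float], Sequence[float]]],
) -> List[List[Point]]:
    """Convert many line strings with a single call to `convert`."""
    # Flatten all coordinates so the converter is called only once.
    lons = [lon for line in linestrings for lon, _ in line]
    lats = [lat for line in linestrings for _, lat in line]

    # Assumed return shape: (eastings, northings), same order as the input.
    eastings, northings = convert(lons, lats)
    converted = list(zip(eastings, northings))

    # Slice the flat result back into the original line strings.
    result = []
    start = 0
    for line in linestrings:
        end = start + len(line)
        result.append(converted[start:end])
        start = end
    return result
```

Usage would look like `convert_linestrings(lines, convert_bng)`, with `convert_bng` imported from `convertbng.cutil` as suggested earlier in the thread.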

Thank you for your time looking at this earlier - it's much appreciated!

joshuanunn avatar Oct 20 '21 12:10 joshuanunn

Hi, I can confirm the intermittent segfaults on a completely different setup (native x86_64 Arch Linux, Python 3.10.9).

Just like joshuanunn, switching to cutil fixed this - but it was a real head scratcher until then!

tmladek avatar Feb 05 '23 15:02 tmladek