Python target memory error

Open petervdonovan opened this issue 2 years ago • 5 comments

EDIT: The original bug that led to this post is unrelated to anything we have in master. However, master still apparently has illegal memory accesses according to this Memcheck output.

~~Background~~

~~In the c-refactoring2 branch, I observed memory errors in ActionDelay.lf for the Python target only (not C). Valgrind + memcheck confirmed that Python only (not C) has an invalid free in static token_freed _lf_done_using(lf_token_t* token).~~

~~I find it curious that we could have such a problem in Python while the C runtime itself passes memcheck.~~

~~Issue~~

~~It appears that the same problem also exists in master -- it just does not manifest itself except in memcheck. It seems possible that having the shared library compiled differently could somehow change the result of an invalid read (e.g., the result of an invalid read in master is NULL, whereas in c-refactoring2, the result is garbage).~~

~~However, this is just speculation -- I'm pretty confused about what is going on and would appreciate any hints.~~

~~Details~~

~~Here is the output of Memcheck when run in master.~~

~~Here is the stacktrace I get from GDB after compiling the shared library in Debug mode in the c-refactoring2 branch. It shows that a segmentation fault occurs in _lf_done_using.~~

petervdonovan avatar Aug 02 '22 05:08 petervdonovan

The gdb trace is not reporting the exact line number, so I'll go by my hunch. You might want to install the Python debug packages (sudo apt install python3-dbg python3-dev) and use py-bt instead of bt in gdb. Hopefully that will reveal more information.

As for the root cause of this segfault, it might be related to CPython's built-in garbage collection for PyObject*s. Token values are PyObject*s in the Python target, so they are created through CPython APIs and are supposed to be garbage collected automatically. Calling free on a token value that is a PyObject* could cause a segfault (because the memory was not allocated with malloc in the first place) and is generally not a good idea.

To accommodate this, there is a _LF_GARBAGE_COLLECTED compile definition in the C runtime. If it is defined, the C runtime should not free token values. From the stack trace, it looks like free is being called on a PyObject*.
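To illustrate the idea, here is a minimal sketch of how such a guard could look. Only the name _LF_GARBAGE_COLLECTED comes from the runtime; sketch_token_t and sketch_done_using are hypothetical stand-ins made up for this example and are not the real lf_token_t or _lf_done_using.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a runtime token; NOT the real lf_token_t. */
typedef struct {
    void *value;    /* payload: malloc'd by C code, or owned by the host language's GC */
    int ref_count;  /* number of readers still using the token */
} sketch_token_t;

/* Hypothetical analogue of a "done using" routine.
 * Returns 1 if this call freed the payload with free(), 0 otherwise. */
int sketch_done_using(sketch_token_t *token) {
    assert(token != NULL);
    if (token->ref_count > 0) {
        token->ref_count--;
    }
    if (token->ref_count > 0 || token->value == NULL) {
        return 0;  /* still in use, or nothing to release */
    }
    int freed = 0;
#ifndef _LF_GARBAGE_COLLECTED
    /* Plain C target: the payload came from malloc, so free() is correct. */
    free(token->value);
    freed = 1;
#endif
    /* With _LF_GARBAGE_COLLECTED defined, the payload is left to its real
     * owner (for example CPython); the runtime only drops its pointer. */
    token->value = NULL;
    return freed;
}
```

If the shared library for the Python target were built without the definition, the free branch would be compiled in and would run on a PyObject*, which would match what the stack trace suggests.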

Soroosh129 avatar Aug 02 '22 07:08 Soroosh129

> To accommodate this, there is a _LF_GARBAGE_COLLECTED compile definition in the C runtime.

I was not aware of that. Tomorrow I will try to confirm that that is the problem so that we can close this issue. Thanks!

petervdonovan avatar Aug 02 '22 07:08 petervdonovan

We also do a lot of manual adjustments to the ref counts of PyObject*s in reactor-c-py using CPython APIs like Py_INCREF and Py_DECREF. It is almost guaranteed that there is a mistake somewhere and that the ref counts of some objects will never reach zero. This could cause a runaway memory leak.
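As a rough illustration of that failure mode, here is a toy model of reference counting in plain C. It does not use the CPython API: toy_obj_t, toy_incref, and toy_decref are hypothetical names that only mimic the roles of PyObject, Py_INCREF, and Py_DECREF.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted object; a model only, NOT CPython's PyObject. */
typedef struct {
    long refcnt;
    int *live_objects;  /* external counter of objects not yet destroyed */
} toy_obj_t;

toy_obj_t *toy_new(int *live_objects) {
    toy_obj_t *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->refcnt = 1;  /* the creator holds the first reference */
    o->live_objects = live_objects;
    (*live_objects)++;
    return o;
}

void toy_incref(toy_obj_t *o) {
    assert(o != NULL);
    o->refcnt++;
}

void toy_decref(toy_obj_t *o) {
    assert(o != NULL && o->refcnt > 0);
    if (--o->refcnt == 0) {
        (*o->live_objects)--;  /* destroyed only when the count hits zero */
        free(o);
    }
}

/* A second holder borrows the object and releases it again.
 * Returns the count left afterwards. */
long toy_use_balanced(toy_obj_t *o) {
    toy_incref(o);
    /* ... use the object ... */
    toy_decref(o);
    return o->refcnt;
}

/* Same as above, but the matching decref was forgotten. */
long toy_use_leaky(toy_obj_t *o) {
    toy_incref(o);
    /* ... use the object ... */
    return o->refcnt;  /* one reference too many from now on */
}
```

After toy_use_leaky, the creator's own toy_decref brings the count to 1 instead of 0, so the object is never destroyed. A single unmatched Py_INCREF would have the same effect on a real PyObject*.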

Soroosh129 avatar Aug 02 '22 07:08 Soroosh129

For the record, you were right @Soroosh129 -- I revised the description of this issue accordingly. Thank you!

petervdonovan avatar Aug 02 '22 22:08 petervdonovan

I used the "debug" version of Python and the volume of warnings emitted by memcheck decreased for some reason. The warnings that remained look like these warnings that are suppressed here. I wonder what that means... Maybe the memcheck errors are a false alarm.

petervdonovan avatar Aug 08 '22 07:08 petervdonovan

Is this in any way related to #1717?

lhstrh avatar May 22 '23 05:05 lhstrh