python-systemd icon indicating copy to clipboard operation
python-systemd copied to clipboard

SIGSEGV using `journal.Reader`, thread safety issue?

Open sebres opened this issue 9 months ago • 3 comments

Env:

$ python3 --version
Python 3.11.2
$ systemctl --version
systemd 252 (252.36-1~deb12u1)
+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified
$ python3 -c "from systemd import journal; print(journal.__version__)"
235

Issue:

I got sporadically a segfault using journal.Reader from python-systemd module. Thereby it seemed to happen by stop of thread, monitoring journal with Reader.

I could reproduce it also under dgb, here is the callstack:

[New Thread 0x7ffff5f876c0 (LWP 89339)]

Thread 2 "f2b/f.DummyJail" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff5f876c0 (LWP 89339)]
0x00007ffff6f86d63 in ?? () from /lib/x86_64-linux-gnu/libsystemd.so.0
(gdb) where
#0  0x00007ffff6f86d63 in ?? () from /lib/x86_64-linux-gnu/libsystemd.so.0
#1  0x00007ffff6f877f5 in ?? () from /lib/x86_64-linux-gnu/libsystemd.so.0
#2  0x00007ffff6f6a848 in ?? () from /lib/x86_64-linux-gnu/libsystemd.so.0
#3  0x00007ffff6f99502 in ?? () from /lib/x86_64-linux-gnu/libsystemd.so.0
#4  0x00007ffff6f69403 in ?? () from /lib/x86_64-linux-gnu/libsystemd.so.0
#5  0x00007ffff6f6961e in ?? () from /lib/x86_64-linux-gnu/libsystemd.so.0
#6  0x00007ffff6f6d11a in sd_journal_get_fd () from /lib/x86_64-linux-gnu/libsystemd.so.0
#7  0x00007ffff6f6f3c9 in sd_journal_wait () from /lib/x86_64-linux-gnu/libsystemd.so.0
#8  0x00007ffff6cadf2c in ?? () from /usr/lib/python3/dist-packages/systemd/_reader.cpython-311-x86_64-linux-gnu.so
#9  0x0000000000548ac8 in ?? ()
#10 0x00000000005188b3 in _PyObject_MakeTpCall ()
#11 0x000000000052c2f0 in _PyEval_EvalFrameDefault ()
#12 0x0000000000585e54 in ?? ()
#13 0x0000000000584e98 in ?? ()
#14 0x000000000053034e in _PyEval_EvalFrameDefault ()
#15 0x0000000000585e54 in ?? ()
#16 0x0000000000584e98 in ?? ()
#17 0x000000000067b32c in ?? ()
#18 0x0000000000656384 in ?? ()
#19 0x00007ffff7d381f5 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#20 0x00007ffff7db889c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

It can be either some missing lock or atomic handling somewhere in journal.Reader, because if I rewrote the code to avoid too earlier clean-up, closing the journal with Reader.close() before join of thread (now it ensures that the thread working with Reader is really stopped, before clean-up happens), it doesn't happen anymore. So my assumption is if TH2 is somewhere within reader.wait() and TH1 would invoke reader.close(), it could segfault. Sure, it can be avoided on the user side, just SF is SF and shall not happen normally.

Naive attempts to reproduce it with simple wait/close in different threads did not work, but I could find a bit complex scenario (with a busy-wait & lock using to synchronize the timing a bit), where it is pretty reproducible, almost every time not later than on 10 iteration:

import time
from systemd import journal
from threading import Thread, Lock
l = Lock(); # just used for "better" timing here to get SF

def test():
  stat = {};
  jr = journal.Reader()
  def mon():
    print('start')
    try:
      jr.seek_head()
      print('work')
      stat['started'] = 1
      l.acquire(); l.release()
      # *-* [A], thread is here as another thread is on *-* [B]
      while (j := jr):
        if j.wait(3) in (journal.NOP, journal.APPEND):
          j.get_next()
    except Exception as e:
      print('ERR: %s' % e)
    if jr: jr.close()
  th = Thread(target=mon); 
  th.start()
  # just "sync" wait and close calls:
  while not stat: time.sleep(0.00001);
  l.acquire(); l.release()
  # *-* [B], thread is here as another thread is on *-* [A]
  # do evil close:
  jr.close(); jr = None
  th.join()
  print('OK')

for i in range(100): # just 100 attempts, because of sporadic matter
  test()

sebres avatar Mar 30 '25 03:03 sebres

As the libsystemd' documentation says:

All functions listed here are thread-agnostic and only a single specific thread may operate on a given object during its entire lifetime. It's safe to allocate multiple independent objects and use each from a specific thread in parallel. However, it's not safe to allocate such an object in one thread, and operate or free it from any other, even if locking is used to ensure these threads don't operate on it at the very same time.

Basically this mean the pythonic systemd.journal.Reader is not thread-safe currently and normally, Reader.close() or whatever other functions shall be prohibited across the threads. Completely, regardless the GIL and co.

sebres avatar Apr 01 '25 13:04 sebres

You are right that the the C code is "thread agnostic", i.e. only one thread can ever access those objects at a single time. This means that those Py_BEGIN_ALLOW_THREADS/ Py_END_ALLOW_THREADS are generally broken. You observed a segfault if .close() was called at the wrong time, but in general, the problem is much wider. If the journal object is passed to multiple threads, and they make any kind of calls on the the object, that will create concurrent calls into the journal C API from multiple threads. So just adding reference counting to prevent the close from happening is not enough.

One broad possibility would be to remove all those Py_BEGIN_ALLOW_THREADS/ Py_END_ALLOW_THREADS. Then the GIL will protect us, because only one Python thread will ever be active, and the journal object is accessible only through the Python object.

The other option would be put all the access to journal object behind a separate mutex. I think that for Python 3.13+ we'll want to do something like this anyway, because of free-threading mode. With ff, different Python threads are allowed to access the same Python objects at the same time. We could either say that the objects provided this module cannot be shared between threads, or add a new mutex around each journal object. I think this is a nicer option, but I only started reading about the details of ff mode, so I'm not sure if I have the full picutre.

keszybz avatar Jul 01 '25 09:07 keszybz

You observed a segfault if .close() was called at the wrong time, but in general, the problem is much wider.

But only close() is dangerous in the way that it'd even segfault (because it destroying the handle, which can be accessed by another thread at the same time). The issues with interop cross-thread access (without to consider close) may surely cause some undefined behavior, but they are at least "safe" against segfault and doesn't cause a crash. Normally a documentation entry would be enough, and actually I don't think someone really want to access journal multi-threaded. Close() is rather an exception and used from another thread just in order to signal "stop" of monitoring if another thread is in waiting state.

Then the GIL will protect us, because only one Python thread will ever be active

And globally lock for several seconds, if some waiting operation is involved? I don't think it is alright.

The other option would be put all the access to journal object behind a separate mutex.

As I wrote above it is not necessary in my opinion. Let alone I don't think it'd be good idea generally. I think fixing close() is important, another stuff seems to be matter of UB either, at least as long as systemd API doesn't get completely "multi-threaded". However I guess the set of functions called thread agnostic mostly refers to issue with the close(). For anything else I'm sure the entry in the documentation would be enough.

sebres avatar Jul 01 '25 11:07 sebres