cpython icon indicating copy to clipboard operation
cpython copied to clipboard

Improve detection of `Py_DECREF` on dictionaries in freelist

Open hansingt opened this issue 1 month ago • 12 comments

Crash report

What happened?

After updating my extension library from Python 3.11 to 3.14, I had a strage segmentation fault raised by my tests. The analysis of this issue took me about a week. The problem here was the introduction of freelists for Python Dictionaries:

When a Python dictionary is freed (refcnt == 0), it get's appended to a list of "free" objects, which may get reused by subsequent PyDict_New() calls. This freelist is implemented as a linked LIFO list. The link between the elements in this list is done by "reusing" the reference counter of the dictionary and storing the address of the next elements in the list in it. This can be problematic because of two reasons:

First of all, if I accidentally Py_DECREF a dictionary too many times, the reference counter does not become negative and hence, this issue is not detected. This is, because the reference counter will container a pointer to the next element as soon as it gets zero. Hence, the next call of Py_DECREF will read the pointer address as integer and assume it is just very large (or in best case negative, in which case it may fail). This is bad, as it hardens bug detection in debug builds.

But the real problem araises, if the integer representation of the pointer in this dict is not larget than the minimum reference count for immortals: If the integer is larger than this value, the dict is counted as immortal and nothing happens - happy case. If not, Py_DECREF (or Py_INCREF) will modify the reference count and thus, accidentally modifies the pointer! At this point the pointer is off by one and the freelist is broken.

Now, if two new dictionaries are requested, this can cause a segmentation fault on the second dict, because the process might read one byte behind its memory. In best case, this pops up near to the place, where the dict was decref'd, but in worse case, this situation can linger for a long time in the program and pop up in a completely unrelated place. E.g. even when doing d = {} in the Python code.

The following code tries to demonstrate this problem:

#include <Python.h>

void main() {
    PyObject *d1, *d2, *dict;
    Py_Initialize();

    // populate the freelist with two dicts
    d1 = PyDict_New();
    Py_DECREF(d1);

    d2 = PyDict_New();
    Py_DECREF(d2);

    // Now decref a dict too many times
    // For this, first create a new dict. It will take the first entry from the freelist (d2).
    dict = PyDict_New();
    assert(dict == d2);
    assert(Py_REFCNT(dict) == 1);

    // Now if we decref this dict, the reference counter will be overwritten
    // with the next element in the freelist (d1).
    Py_DECREF(dict);
    assert(Py_REFCNT(dict) == (Py_ssize_t)d1);

    // Now comes the bug: 
    // IFF (int)d1 < _Py_IMMORTAL_MINIMUM_REFCNT, the dict is not considered immortal
    // and a Py_DECREF will accidentally modify the address pointer
    Py_DECREF(dict);
    assert(PyUnstable_IsImmortal(dict) || Py_REFCNT(dict) == (Py_ssize_t)d1 - 1);
}

The output of this depends on the address of d1 and whether it's integer representation is larger than _Py_IMMORTAL_MINIMUM_REFCNT or not.

CPython versions tested on:

3.14

Operating systems tested on:

Linux, Windows

Output from running 'python -VV' on the command line:

Python 3.14.1 (main, Dec 2 2025, 19:40:56) [Clang 21.1.4 ]

hansingt avatar Dec 05 '25 06:12 hansingt

First of all, your code is not safe. assert(Py_REFCNT(dict) == (Py_ssize_t)d1); is not safe, because dict maybe be freed after previous Py_DECREF.

Second, we have Py_REF_DEBUG configuration option. In this case, if refcounter goes negative we get an assert. Is this option useful for you?

sergey-miryanov avatar Dec 05 '25 07:12 sergey-miryanov

First of all, your code is not safe. assert(Py_REFCNT(dict) == (Py_ssize_t)d1); is not safe, because dict maybe be freed after previous Py_DECREF.

This is just an example to demonstrate the issue. With the introduction of freelists for dicts, a dictionary will exactly not be freed, as long as the freelist is not full. If the freelist is full, the issue does not occurr.

Second, we have Py_REF_DEBUG configuration option. In this case, if refcounter goes negative we get an assert. Is this option useful for you?

That's exactly, what I was trying to tell: It may not get negative, because the reference counter stores the address of the next element in the freelist and will be interpreted as Py_ssize_t in Py_DECREF. Hence, if the integer representaion of the address in the reference counter is not negative, this is not recognized by Py_DECREF, even with Py_REF_DEBUG enabled. That's exactly the first part of my issue here.

The second part then is, if the integer representation of this address is not large enough for the dict to be considered immortal, it does even get modified by Py_DECREF, corrupting the freelist.

hansingt avatar Dec 05 '25 07:12 hansingt

Ough, I see that you mean.

It is where we update linked lists: https://github.com/python/cpython/blob/53ec7c8fc07eb6958869638a0cad70c52ad6fcf5/Include/internal/pycore_freelist.h#L55-L61

sergey-miryanov avatar Dec 05 '25 07:12 sergey-miryanov

@colesbury Could you please take a look?

sergey-miryanov avatar Dec 05 '25 07:12 sergey-miryanov

After updating my extension library from Python 3.11 to 3.14, I had a strage segmentation fault raised by my tests

So if I understand correctly, your extension called Py_DECREF on freed dictionaries and this wasn't detected on 3.11. On 3.14, this routinely crashed. That seems fine?

We had freelists for dictionaries before 3.14. They were just implemented differently and 3.14 unified the freelist implementation across many objects.

Using the first word of a freed memory block for the freelist is a very normal thing for allocators to do. We can consider adding extra detection for freelist corruption, but you should expect that your program will crash or misbehave if you call Py_DECREF on a freed object.

colesbury avatar Dec 05 '25 14:12 colesbury

@colesbury I agree with you. However, it seems that Py_REF_DEBUG has become less useful. Even if we aren't crashing the freelist, we can decref an object that we don't own.

sergey-miryanov avatar Dec 05 '25 14:12 sergey-miryanov

Here is a relatively simple idea: In Py_REF_DEBUG, invert (~) the pointer used for the freelist. That will make typically positive pointers look like negative refcounts.

colesbury avatar Dec 05 '25 15:12 colesbury

So if I understand correctly, your extension called Py_DECREF on freed dictionaries and this wasn't detected on 3.11. On 3.14, this routinely crashed. That seems fine?

Of course not. I know, that this is a bug in our software and that it can cause misbegaviour or crashes. I don't know. why we did not detect it in 3.11, but at least with 3.14, we couldn't detect it with Py_REF_DEBUG.

Here is a relatively simple idea: In Py_REF_DEBUG, invert (~) the pointer used for the freelist. That will make typically positive pointers look like negative refcounts.

Integer representations of pointers can be negative as well, of the memory address is high enough. You won't detect these then.

What I know, what other allocators so in these cases is using a tagged pointer: The memory blocks do have some alignment you know at compile time. This leaves some of the lower bits unused. Theseccan then be used to differentiate between a pointer and a reference count. Of course, this would "steal" one bit from reference counting, but I don't think hat this is problematic, as everything above 2^30 is counted as immutable anyway I think. So there are two bits left.

hansingt avatar Dec 06 '25 05:12 hansingt

Any thoughts about this? Can we improve this somehow?

hansingt avatar Dec 15 '25 09:12 hansingt

Why not just have the python runtime incref and decref the dictionaries in the freelist for you?

Incref and create on type creation with expandable capacity, then decref when the type gets dealloc'd by the runtime as well.

AraHaan avatar Dec 15 '25 10:12 AraHaan

Why not just have the python runtime incref and decref the dictionaries in the freelist for you?

Incref and create on type creation with expandable capacity, then decref when the type gets dealloc'd by the runtime as well.

I think, I cannot follow you. Do you suggest, that objects in the freelist should have a reference counter unequal to zero? Or what's your suggestion here?

The problem is not the reference couting itself, but the fact that the reference-counter is reused by the allocator to store the pointer to the next elements in the linked-list. This causes the reference counter to "jump" to a (pseudo) random value when it should become zero and we currently cannot detect this situation to throw an exception in case of another (faulty) decref.

hansingt avatar Dec 15 '25 10:12 hansingt

I think, I cannot follow you. Do you suggest, that objects in the freelist should have a reference counter unequal to zero? Or what's your suggestion here?

I am saying to have the runtime that constructs the object to maintain the reference count of the linked-list inside the object instead while also being able to add/edit/remove elements inside of it without issues.

AraHaan avatar Dec 16 '25 15:12 AraHaan

I am saying to have the runtime that constructs the object to maintain the reference count of the linked-list inside the object instead while also being able to add/edit/remove elements inside of it without issues.

Hm.. I'm still not sure I really understand you, but in fact, that's exactly what's happening? A PyDict_New() will get an object from the freelist (if available) and incref it before returning it to you. When you decref it again and the reference counter get's zero, it will be "dealloc'd", meaning it will be put back into the freelist. That's exactly the problematic part here: Putting it into the freelist means, that the reference counter get's overwritten with the pointer to the next element in the freelist (it's a LIFO queue).

Due to this , when (accidentally) calling Py_DECREF once again, it will interpret the pointer address as an integer.

  • In the good case, this integer is negative, which triggers the Py_REF_DEBUG assertion about the reference count being greater than zero.
  • In bad case, the integer is larger than the _Py_IMMORTAL_MINIMUM_REFCNT and Py_DECREF cannot detect this, as it just assumes, that it is an immortal object and does nothing.
  • In worst case, the integer is positive but not larger than _Py_IMMORTAL_MINIMUM_REFCNT. In this case, Py_DECREF does not only not detect this situation, but also does modify the referefence counter and thus, the pointer. This then corrupts the freelist and may cause segmentation faults, when trying to access the (now misaligned) pointer.

What I wanted to solve with this issue is, that Py_DECREF should be able to detect the situation, when the reference counter actually is a pointer and thus throws an exception in this case, just like it is supposed to do with Py_REF_DEBUG.

The only solution I could come up with is using the tagged pointers I already mentioned. But with this, we need to carfully think about existing extensions which are built against older versions, as they are using the lower bits for reference counting. I'm not sure, how to solve this, besides shifting the pointer right and using the upper most bit as flag. This, of course, complicates reading the pointer as you cannot simply read with a bitmask but need to shift left before accessing it. Might be okay though.

Maybe an example might help here:

The upper most bit of a signed integer denotes the sign (positive, negative) of the number. Hence, for your reference counting, this bit is unused (negative reference counts are not allowed). On the other hand, on modern architectures, pointers are aligend to multiples word sizes. Let's assume our hardware is 2-byte aligend (you can only access bytes 0, 2, 4, ...). In this case, the lowest bit is unused.

This means, we can shift our pointers one bit to the right and use the upper most bit as flag, by setting it to 1:

0b0101000110110001011010101101010001100110111110000000100101101100 will become 0b1010100011011000101101010110101000110011011111000000010010110110 (note the 1 in upper most bit).

The benefit now is: As Py_DECREF interprets this as a signed integer, it will always be negative and thus, trigger the existing assertion!

When accessing the pointer, we just need to shift it one bit to the left.

hansingt avatar Dec 17 '25 07:12 hansingt