cpython icon indicating copy to clipboard operation
cpython copied to clipboard

Use-after-free in `save_picklebuffer` via re-entrant `buffer_callback` and `__bool__`

Open jackfromeast opened this issue 2 weeks ago • 10 comments

What happened?

In save_picklebuffer a user buffer_callback returns an object whose __bool__ releases the PickleBuffer, freeing the bytearray while the function still holds view->buf. The pickler then proceeds into _save_bytearray_data, which copies from the stale pointer and triggers a use-after-free.

Proof of Concept:

import io
import pickle

base = bytearray(b'A' * 0x1000)
pb = pickle.PickleBuffer(base)

class Evil:
    def __bool__(self):
        global base
        pb.release()
        base = None
        return True

def callback(pb):
    return Evil()

pickle.dumps(pb, protocol=5, buffer_callback=callback)

Affected Versions

Python Version Status Exit Code
Python 3.9.24+ (heads/3.9:111bbc15b26, Oct 28 2025, 16:51:20) ASAN 1
Python 3.10.19+ (heads/3.10:014261980b1, Oct 28 2025, 16:52:08) [Clang 18.1.3 (1ubuntu1)] ASAN 1
Python 3.11.14+ (heads/3.11:88f3f5b5f11, Oct 28 2025, 16:53:08) [Clang 18.1.3 (1ubuntu1)] ASAN 1
Python 3.12.12+ (heads/3.12:8cb2092bd8c, Oct 28 2025, 16:54:14) [Clang 18.1.3 (1ubuntu1)] ASAN 1
Python 3.13.9+ (heads/3.13:9c8eade20c6, Oct 28 2025, 16:55:18) [Clang 18.1.3 (1ubuntu1)] ASAN 1
Python 3.14.0+ (heads/3.14:2e216728038, Oct 28 2025, 16:56:16) [Clang 18.1.3 (1ubuntu1)] ASAN 1
Python 3.15.0a1+ (heads/main:f5394c257ce, Oct 28 2025, 19:29:54) [GCC 13.3.0] ASAN 1

Vulnerable Code

/* Buggy Re-entrant Path */
static PyObject *
_pickle_Pickler_dump_impl(PicklerObject *self, PyTypeObject *cls,
                          PyObject *obj)
{
    PickleState *st = _Pickle_GetStateByClass(cls);
    /* ... */
    if (dump(st, self, obj) < 0)
        return NULL;
    /* ... */
    Py_RETURN_NONE;
}

static int
save_picklebuffer(PickleState *st, PicklerObject *self, PyObject *obj)
{
    /* ... */
    /* crashing pointer derived: view->buf */
    int in_band = 1;
    if (self->buffer_callback != NULL) {
        PyObject *ret = PyObject_CallOneArg(self->buffer_callback, obj);  /* Reentrant call site */
        if (ret == NULL) {
            return -1;
        }
        in_band = PyObject_IsTrue(ret);  /* Reentrant call site */
        Py_DECREF(ret);
        if (in_band == -1) {
            return -1;
        }
    }
    if (in_band) {
        /* Write data in-band */
        if (view->readonly) {
            return _save_bytes_data(st, self, obj, (const char *)view->buf,
                                    view->len);
        }
        else {
            return _save_bytearray_data(st, self, obj, (const char *)view->buf,
                                        view->len);
        }
    }
    /* ... */
}

/* Crash site */
static Py_ssize_t
_Pickler_Write(PicklerObject *self, const char *s, Py_ssize_t data_len)
{
    /* ... */
    memcpy(buffer + self->output_len, s, data_len);  /* Crash site */
    /* ... */
    return data_len;
}

/* Clobbering Path */
int
PyPickleBuffer_Release(PyObject *obj)
{
    PyPickleBufferObject *self = (PyPickleBufferObject *) obj;
    /* ... */
    PyBuffer_Release(&self->view);  /* state mutate site */
    return 0;
}

Sanitizer Output

=================================================================
==1950369==ERROR: AddressSanitizer: heap-use-after-free on address 0x52100002d110 at pc 0x79df96afb42e bp 0x7ffdcd025960 sp 0x7ffdcd025108
READ of size 4096 at 0x52100002d110 thread T0
    #0 0x79df96afb42d in memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc:115
    #1 0x79df95b52985 in memcpy /usr/include/x86_64-linux-gnu/bits/string_fortified.h:29
    #2 0x79df95b52985 in _Pickler_Write Modules/_pickle.c:1088
    #3 0x79df95b54997 in _Pickler_write_bytes Modules/_pickle.c:2381
    #4 0x79df95b555db in _save_bytearray_data Modules/_pickle.c:2505
    #5 0x79df95b55a30 in save_picklebuffer Modules/_pickle.c:2590
    #6 0x79df95b56209 in save Modules/_pickle.c:4435
    #7 0x79df95b5d92e in dump Modules/_pickle.c:4611
    #8 0x79df95b5dc65 in _pickle_Pickler_dump_impl Modules/_pickle.c:4685
    #9 0x79df95b5de40 in _pickle_Pickler_dump Modules/clinic/_pickle.c.h:73
    #10 0x58d50b4990b6 in cfunction_vectorcall_FASTCALL_KEYWORDS_METHOD Objects/methodobject.c:481
    #11 0x58d50b3e6e7f in _PyObject_VectorcallTstate Include/internal/pycore_call.h:169
    #12 0x58d50b3e6f72 in PyObject_Vectorcall Objects/call.c:327
    #13 0x58d50b665056 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1620
    #14 0x58d50b6a8e54 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:121
    #15 0x58d50b6a9148 in _PyEval_Vector Python/ceval.c:2001
    #16 0x58d50b6a93f8 in PyEval_EvalCode Python/ceval.c:884
    #17 0x58d50b7a0507 in run_eval_code_obj Python/pythonrun.c:1365
    #18 0x58d50b7a0723 in run_mod Python/pythonrun.c:1459
    #19 0x58d50b7a157a in pyrun_file Python/pythonrun.c:1293
    #20 0x58d50b7a4220 in _PyRun_SimpleFileObject Python/pythonrun.c:521
    #21 0x58d50b7a44f6 in _PyRun_AnyFileObject Python/pythonrun.c:81
    #22 0x58d50b7f574d in pymain_run_file_obj Modules/main.c:410
    #23 0x58d50b7f59b4 in pymain_run_file Modules/main.c:429
    #24 0x58d50b7f71b2 in pymain_run_python Modules/main.c:691
    #25 0x58d50b7f7842 in Py_RunMain Modules/main.c:772
    #26 0x58d50b7f7a2e in pymain_main Modules/main.c:802
    #27 0x58d50b7f7db3 in Py_BytesMain Modules/main.c:826
    #28 0x58d50b27b645 in main Programs/python.c:15
    #29 0x79df9662a1c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #30 0x79df9662a28a in __libc_start_main_impl ../csu/libc-start.c:360
    #31 0x58d50b27b574 in _start (/home/jackfromeast/Desktop/entropy/targets/grammar-afl++-latest/targets/cpython/python+0x2dd574) (BuildId: 202d5dbb945f6d5f5a66ad50e2688d56affd6ecb)

0x52100002d110 is located 16 bytes inside of 4121-byte region [0x52100002d100,0x52100002e119)
freed by thread T0 here:
    #0 0x79df96afc4d8 in free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
    #1 0x58d50b4ad96d in _PyMem_RawFree Objects/obmalloc.c:91
    #2 0x58d50b4afcd9 in _PyMem_DebugRawFree Objects/obmalloc.c:2955
    #3 0x58d50b4afd1a in _PyMem_DebugFree Objects/obmalloc.c:3100
    #4 0x58d50b4d7348 in PyMem_Free Objects/obmalloc.c:1070
    #5 0x58d50b3c4eef in bytearray_dealloc Objects/bytearrayobject.c:1173
    #6 0x58d50b4a4481 in _Py_Dealloc Objects/object.c:3200
    #7 0x58d50b46f874 in Py_DECREF Include/refcount.h:401
    #8 0x58d50b46f893 in Py_XDECREF Include/refcount.h:511
    #9 0x58d50b47e2ff in insertdict Objects/dictobject.c:1927
    #10 0x58d50b47e67e in setitem_take2_lock_held Objects/dictobject.c:2675
    #11 0x58d50b47ee45 in _PyDict_SetItem_Take2 Objects/dictobject.c:2683
    #12 0x58d50b47eec1 in PyDict_SetItem Objects/dictobject.c:2703
    #13 0x58d50b6a1446 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:11090
    #14 0x58d50b6a8e54 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:121
    #15 0x58d50b6a9148 in _PyEval_Vector Python/ceval.c:2001
    #16 0x58d50b3e69b8 in _PyFunction_Vectorcall Objects/call.c:413
    #17 0x58d50b3e6e7f in _PyObject_VectorcallTstate Include/internal/pycore_call.h:169
    #18 0x58d50b3e703f in PyObject_CallOneArg Objects/call.c:395
    #19 0x58d50b4f9648 in call_unbound_noarg Objects/typeobject.c:3040
    #20 0x58d50b514fa0 in maybe_call_special_no_args Objects/typeobject.c:3153
    #21 0x58d50b5152c2 in slot_nb_bool Objects/typeobject.c:10464
    #22 0x58d50b4a35b0 in PyObject_IsTrue Objects/object.c:2060
    #23 0x79df95b55846 in save_picklebuffer Modules/_pickle.c:2577
    #24 0x79df95b56209 in save Modules/_pickle.c:4435
    #25 0x79df95b5d92e in dump Modules/_pickle.c:4611
    #26 0x79df95b5dc65 in _pickle_Pickler_dump_impl Modules/_pickle.c:4685
    #27 0x79df95b5de40 in _pickle_Pickler_dump Modules/clinic/_pickle.c.h:73
    #28 0x58d50b4990b6 in cfunction_vectorcall_FASTCALL_KEYWORDS_METHOD Objects/methodobject.c:481
    #29 0x58d50b3e6e7f in _PyObject_VectorcallTstate Include/internal/pycore_call.h:169

previously allocated by thread T0 here:
    #0 0x79df96afd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x58d50b4ae284 in _PyMem_RawMalloc Objects/obmalloc.c:63
    #2 0x58d50b4ad655 in _PyMem_DebugRawAlloc Objects/obmalloc.c:2887
    #3 0x58d50b4b064f in _PyMem_DebugRawRealloc Objects/obmalloc.c:2963
    #4 0x58d50b4b0823 in _PyMem_DebugRealloc Objects/obmalloc.c:3108
    #5 0x58d50b4d72e7 in PyMem_Realloc Objects/obmalloc.c:1063
    #6 0x58d50b3bfc04 in bytearray_resize_lock_held Objects/bytearrayobject.c:258
    #7 0x58d50b3cd24c in PyByteArray_Resize Objects/bytearrayobject.c:278
    #8 0x58d50b3cd850 in bytearray___init___impl Objects/bytearrayobject.c:978
    #9 0x58d50b3ce2c9 in bytearray___init__ Objects/clinic/bytearrayobject.c.h:102
    #10 0x58d50b50d3c0 in type_call Objects/typeobject.c:2460
    #11 0x58d50b3e6c71 in _PyObject_MakeTpCall Objects/call.c:242
    #12 0x58d50b3e6f19 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:167
    #13 0x58d50b3e6f72 in PyObject_Vectorcall Objects/call.c:327
    #14 0x58d50b665056 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1620
    #15 0x58d50b6a8e54 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:121
    #16 0x58d50b6a9148 in _PyEval_Vector Python/ceval.c:2001
    #17 0x58d50b6a93f8 in PyEval_EvalCode Python/ceval.c:884
    #18 0x58d50b7a0507 in run_eval_code_obj Python/pythonrun.c:1365
    #19 0x58d50b7a0723 in run_mod Python/pythonrun.c:1459
    #20 0x58d50b7a157a in pyrun_file Python/pythonrun.c:1293
    #21 0x58d50b7a4220 in _PyRun_SimpleFileObject Python/pythonrun.c:521
    #22 0x58d50b7a44f6 in _PyRun_AnyFileObject Python/pythonrun.c:81
    #23 0x58d50b7f574d in pymain_run_file_obj Modules/main.c:410
    #24 0x58d50b7f59b4 in pymain_run_file Modules/main.c:429
    #25 0x58d50b7f71b2 in pymain_run_python Modules/main.c:691
    #26 0x58d50b7f7842 in Py_RunMain Modules/main.c:772
    #27 0x58d50b7f7a2e in pymain_main Modules/main.c:802
    #28 0x58d50b7f7db3 in Py_BytesMain Modules/main.c:826
    #29 0x58d50b27b645 in main Programs/python.c:15

SUMMARY: AddressSanitizer: heap-use-after-free ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc:115 in memcpy
Shadow bytes around the buggy address:
  0x52100002ce80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x52100002cf00: 00 00 00 00 00 fa fa fa fa fa fa fa fa fa fa fa
  0x52100002cf80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x52100002d000: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x52100002d080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x52100002d100: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x52100002d180: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x52100002d200: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x52100002d280: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x52100002d300: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x52100002d380: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1950369==ABORTING

Linked PRs

  • gh-143312
  • gh-143396
  • gh-143397

jackfromeast avatar Dec 31 '25 14:12 jackfromeast

I can reproduce this with the code snippet above. It looks like Py_buffer handling in _pickle.c isn't accounting for the buffer being released during the callback. I'd be happy to take a look and work on a patch, which would likely be adding a validation check for this within the save_picklebuffer function.

a12k avatar Dec 31 '25 15:12 a12k

Usually, it's simply better to increment the number of exports for the buffer. That's how we fixed past UAFs.

picnixz avatar Dec 31 '25 15:12 picnixz

@picnixz because PickleBuffer.release() explicitly signals that the buffer is released, my thinking was that raising an error would be more semantically correct here than silently keeping the memory alive.

RE: the keep-alive approach, I would implement it by calling PyObject_GetBuffer(view->obj, &temp_view, ...) to hold a temporary export on the stack and make sure PyBuffer_Release on all exit paths. That would be more complex.

a12k avatar Dec 31 '25 15:12 a12k

Oh wait, there is no exports computed here so we can't use that. Mmh, this seems trickier.

picnixz avatar Dec 31 '25 16:12 picnixz

The following patch should do the job:

diff --git a/Modules/_pickle.c b/Modules/_pickle.c
index 608598eb5a5..5cbdfd84941 100644
--- a/Modules/_pickle.c
+++ b/Modules/_pickle.c
@@ -2646,43 +2646,58 @@ save_picklebuffer(PickleState *st, PicklerObject *self, PyObject *obj)
                         "pointing to a non-contiguous buffer");
         return -1;
     }
+
+    int rc = 0;
+    Py_buffer keepalive_view;  // to ensure that 'view' is kept alive
+    if (PyObject_GetBuffer(view->obj, &keepalive_view, PyBUF_SIMPLE) != 0) {
+        return -1;
+    }
+
     int in_band = 1;
     if (self->buffer_callback != NULL) {
         PyObject *ret = PyObject_CallOneArg(self->buffer_callback, obj);
         if (ret == NULL) {
-            return -1;
+            goto error;
         }
         in_band = PyObject_IsTrue(ret);
         Py_DECREF(ret);
         if (in_band == -1) {
-            return -1;
+            goto error;
         }
     }
     if (in_band) {
         /* Write data in-band */
         if (view->readonly) {
-            return _save_bytes_data(st, self, obj, (const char *)view->buf,
-                                    view->len);
+            rc = _save_bytes_data(st, self, obj, (const char *)view->buf,
+                                  view->len);
         }
         else {
-            return _save_bytearray_data(st, self, obj, (const char *)view->buf,
-                                        view->len);
+            rc = _save_bytearray_data(st, self, obj, (const char *)view->buf,
+                                      view->len);
         }
+        goto exit;
     }
     else {
         /* Write data out-of-band */
         const char next_buffer_op = NEXT_BUFFER;
         if (_Pickler_Write(self, &next_buffer_op, 1) < 0) {
-            return -1;
+            goto error;
         }
         if (view->readonly) {
             const char readonly_buffer_op = READONLY_BUFFER;
             if (_Pickler_Write(self, &readonly_buffer_op, 1) < 0) {
-                return -1;
+                goto error;
             }
         }
     }
-    return 0;
+
+exit:
+    PyBuffer_Release(&keepalive_view);
+    return rc;
+
+error:
+    PyBuffer_Release(&keepalive_view);
+    return -1;
 }

 /* A copy of PyUnicode_AsRawUnicodeEscapeString() that also translates

picnixz avatar Dec 31 '25 16:12 picnixz

@a12k Are you working on a patch? I have other UAFs to fix but you can take my patch. You'll just need to some tests and a NEWS entry, but if you don't want to I can add it to my backlog.

picnixz avatar Dec 31 '25 16:12 picnixz

@picnixz Yeah I had re-pro'd and was starting to poke around working on a patch based on your guidance. Happy to take the above and implement. I'll make a PR with the above + tests + a news entry. Thank you!

a12k avatar Dec 31 '25 16:12 a12k

You're welcome. Also, don't forget to read https://devguide.python.org/getting-started/pull-request-lifecycle/ and https://devguide.python.org/getting-started/generative-ai/. Once your PR is ready, you can ping me.

picnixz avatar Dec 31 '25 16:12 picnixz

@picnixz The above should be nearly exactly your recommended fix above, which did indeed fix this. I say nearly because I needed to change PyBUF_SIMPLE to PyBUF_FULL_RO because PyBUF_SIMPLE seemed to be too restrictive to get existing tests to pass. It caused failures in both test_pickle.py and test_pickletools.py.

I added a couple tests as well.

Thanks for the literature recommendation!

a12k avatar Dec 31 '25 18:12 a12k

Oh yes, sorry it was a PyBUF_FULL_RO because PyPickleBuffer_FromObject also uses a PyBUF_FULL_RO.

picnixz avatar Dec 31 '25 18:12 picnixz

Closing as the PRs have been merged.

encukou avatar Jan 08 '26 10:01 encukou