
Crashed in a multi-threaded environment

Open wciq1208 opened this issue 1 year ago • 2 comments

pypdfium2==4.30.0 marker-pdf==0.2.13

I call the code as follows:

    self.model = load_all_models()
    # other code
    with self._chat_lock:
        full_text, images, out_meta = convert_single_pdf(
            pdf_file.name, self.model, max_pages=max_pages, langs=langs,
            batch_multiplier=batch_multiplier, start_page=start_page,
        )

An error occurred when triggering Python's GC:

#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=46945239307840) at ./nptl/pthread_kill.c:44 #1 __pthread_kill_internal (signo=6, threadid=46945239307840) at ./nptl/pthread_kill.c:78 #2 __GI___pthread_kill (threadid=46945239307840, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 #3 0x00002aaaaac31476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #4 0x00002aaaaac177f3 in __GI_abort () at ./stdlib/abort.c:79 #5 0x00002aaaaac78676 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x2aaaaadcab77 "%s\n") at ../sysdeps/posix/libc_fatal.c:155 #6 0x00002aaaaac8fcfc in malloc_printerr (str=str@entry=0x2aaaaadc870e "corrupted double-linked list") at ./malloc/malloc.c:5664 #7 0x00002aaaaac907cc in unlink_chunk (p=, av=0x2ab754000030) at ./malloc/malloc.c:1635 #8 0x00002aaaaac90969 in malloc_consolidate (av=av@entry=0x2ab754000030) at ./malloc/malloc.c:4780 #9 0x00002aaaaac91ea0 in _int_free (av=0x2ab754000030, p=0x2ab7546bce60, have_lock=) at ./malloc/malloc.c:4674 #10 0x00002aaaaac94453 in __GI___libc_free (mem=) at ./malloc/malloc.c:3391 #11 0x00002aab5f946488 in std::__Cr::deque<std::__Cr::unique_ptr<CPDF_ObjectWalker::SubobjectIterator, std::__Cr::default_delete<CPDF_ObjectWalker::SubobjectIterator> >, std::__Cr::allocator<std::__Cr::unique_ptr<CPDF_ObjectWalker::SubobjectIterator, std::__Cr::default_delete<CPDF_ObjectWalker::SubobjectIterator> > > >::~deque() () from /opt/conda/lib/python3.10/site-packages/pypdfium2_raw/libpdfium.so #12 0x00002aab5f946236 in CPDF_PageObjectHolder::~CPDF_PageObjectHolder() () from /opt/conda/lib/python3.10/site-packages/pypdfium2_raw/libpdfium.so #13 0x00002aab5f942d1e in CPDF_Page::~CPDF_Page() () from /opt/conda/lib/python3.10/site-packages/pypdfium2_raw/libpdfium.so #14 0x00002aaaab31d052 in ffi_call_unix64 () from /opt/conda/lib/python3.10/lib-dynload/../../libffi.so.8 #15 0x00002aaaab31b925 in ffi_call_int () from /opt/conda/lib/python3.10/lib-dynload/../../libffi.so.8 #16 
0x00002aaaab31c06e in ffi_call () from /opt/conda/lib/python3.10/lib-dynload/../../libffi.so.8 #17 0x00002aaaab2fc1e7 in _call_function_pointer (argtypecount=, argcount=1, resmem=0x2ab24a4feb90, restype=, atypes=, avalues=, pProc=0x2aab5fa2c3b0 <FPDF_ClosePage>, flags=4353) at /usr/local/src/conda/python-3.10.14/Modules/_ctypes/callproc.c:916 #18 _ctypes_callproc (pProc=0x2aab5fa2c3b0 <FPDF_ClosePage>, argtuple=0x2ab24b34f340, flags=4353, argtypes=, restype=0x747720 <_Py_NoneStruct>, checker=0x0) at /usr/local/src/conda/python-3.10.14/Modules/_ctypes/callproc.c:1262 #19 0x00002aaaab30523e in PyCFuncPtr_call (self=, inargs=, kwds=0x0) at /usr/local/src/conda/python-3.10.14/Modules/_ctypes/_ctypes.c:4221 #20 0x00000000004f705b in _PyObject_MakeTpCall (tstate=0x2ab73c0608c0, callable=0x2aab5f6213c0, args=, nargs=, keywords=0x0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:215 #21 0x00000000004f3106 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2aaf5e17fb20, callable=0x2aab5f6213c0, tstate=) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:112 #22 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2aaf5e17fb20, callable=0x2aab5f6213c0, tstate=) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:99 #23 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2aaf5e17fb20, callable=0x2aab5f6213c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123 #24 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a4fee80, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893 #25 _PyEval_EvalFrameDefault (tstate=, f=0x2aaf5e17f9a0, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4181 #26 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2aaf5e17f9a0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #27 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab5fbb2720, tstate=0x2ab73c0608c0) at 
/usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #28 _PyFunction_Vectorcall (func=0x2aab5fbb2710, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #29 0x00000000004f08a9 in do_call_core (kwdict=0x2ab50b9f0640, callargs=0x2ab24b6895c0, func=0x2aab5fbb2710, trace_info=0x2ab24a4ff040, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5945 #30 _PyEval_EvalFrameDefault (tstate=, f=0x2aaf52726440, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4277 #31 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2aaf52726440, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #32 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab5f630dd0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #33 _PyFunction_Vectorcall (func=0x2aab5f630dc0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #34 0x00000000004f08a9 in do_call_core (kwdict=0x2ab50b9f0a00, callargs=0x2ab24a9475e0, func=0x2aab5f630dc0, trace_info=0x2ab24a4ff200, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5945 #35 _PyEval_EvalFrameDefault (tstate=, f=0x2aabc3a928c0, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4277 #36 0x00000000004f63ad in _PyEval_EvalFrame (throwflag=0, f=0x2aabc3a928c0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #37 _PyEval_Vector (kwnames=0x0, argcount=, args=, locals=0x0, con=0x2aaaab149d90, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #38 _PyFunction_Vectorcall (kwnames=0x0, nargsf=, stack=, func=0x2aaaab149d80) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #39 _PyObject_FastCallDictTstate (tstate=0x2ab73c0608c0, callable=0x2aaaab149d80, args=, nargsf=, kwargs=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:142 #40 0x0000000000507b36 in 
_PyObject_Call_Prepend (tstate=0x2ab73c0608c0, callable=0x2aaaab149d80, obj=0x2ab48fe2ca80, args=, kwargs=0x0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:431 #41 0x00000000005cf913 in slot_tp_call (self=0x2ab48fe2ca80, args=0x2ab24b44d6c0, kwds=0x0) at /usr/local/src/conda/python-3.10.14/Objects/typeobject.c:7494 #42 0x00000000004f705b in _PyObject_MakeTpCall (tstate=0x2ab73c0608c0, callable=0x2ab48fe2ca80, args=, nargs=, keywords=0x0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:215 #43 0x000000000059860a in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=9223372036854775809, args=0x2ab24a4ff458, callable=0x2ab48fe2ca80, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:112 #44 PyObject_CallOneArg (func=0x2ab48fe2ca80, arg=) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:184 #45 0x00000000004e19c9 in handle_weakrefs (old=0x75b6d0, unreachable=0x2ab24a4ff520) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:887 #46 gc_collect_main (tstate=0x2ab73c0608c0, generation=2, n_collected=0x2ab24a4ff600, n_uncollectable=0x2ab24a4ff5f8, nofail=0) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:1281 #47 0x000000000059168c in gc_collect_with_callback (tstate=tstate@entry=0x2ab73c0608c0, generation=2) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:1413 #48 0x00000000004d789a in gc_collect_generations (tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:1468 #49 _PyObject_GC_Alloc (basicsize=, use_calloc=0) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:2297 #50 _PyObject_GC_Malloc (basicsize=) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:2307 #51 _PyObject_GC_New (tp=0x749da0 <PyDict_Type>) at /usr/local/src/conda/python-3.10.14/Modules/gcmodule.c:2319 #52 0x00000000004d8d5a in new_dict (values=0x759048 <empty_values>, keys=0x749d60 <empty_keys_struct>) at /usr/local/src/conda/python-3.10.14/Objects/dictobject.c:663 #53 
PyDict_New () at /usr/local/src/conda/python-3.10.14/Objects/dictobject.c:745 #54 0x00002aab129d56e6 in _parse_object_unicode (next_idx_ptr=, idx=50500, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:704 #55 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=50499, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1064 #56 0x00002aab129d55b9 in _parse_array_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:841 #57 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1072 --Type <RET> for more, q to quit, c to continue without paging-- #58 0x00002aab129d583f in _parse_object_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:743 #59 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1064 #60 0x00002aab129d55b9 in _parse_array_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:841 #61 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1072 #62 0x00002aab129d583f in _parse_object_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:743 #63 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1064 #64 0x00002aab129d55b9 in _parse_array_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:841 #65 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at 
/usr/local/src/conda/python-3.10.14/Modules/_json.c:1072 #66 0x00002aab129d583f in _parse_object_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:743 #67 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1064 #68 0x00002aab129d55b9 in _parse_array_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:841 #69 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1072 #70 0x00002aab129d583f in _parse_object_unicode (next_idx_ptr=, idx=, pystr=0x2ab740437170, s=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:743 #71 scan_once_unicode (s=0x2aab5a4088e0, pystr=0x2ab740437170, idx=, next_idx_ptr=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1064 #72 0x00002aab129d4c48 in scanner_call (self=0x2aab5a4088e0, args=, kwds=) at /usr/local/src/conda/python-3.10.14/Modules/_json.c:1149 #73 0x00000000004f705b in _PyObject_MakeTpCall (tstate=0x2ab73c0608c0, callable=0x2aab5a4088e0, args=, nargs=, keywords=0x0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:215 #74 0x00000000004f3106 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2ab6e5f706d0, callable=0x2aab5a4088e0, tstate=) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:112 #75 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2ab6e5f706d0, callable=0x2aab5a4088e0, tstate=) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:99 #76 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2ab6e5f706d0, callable=0x2aab5a4088e0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123 #77 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a4ffc60, tstate=) at 
/usr/local/src/conda/python-3.10.14/Python/ceval.c:5893 #78 _PyEval_EvalFrameDefault (tstate=, f=0x2ab6e5f70530, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4181 #79 0x00000000005095ce in _PyEval_EvalFrame (throwflag=0, f=0x2ab6e5f70530, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #80 _PyEval_Vector (kwnames=, argcount=, args=0x2ab7507cd8b8, locals=0x0, con=0x2aab5a3c35c0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #81 _PyFunction_Vectorcall (kwnames=, nargsf=, stack=0x2ab7507cd8b8, func=0x2aab5a3c35b0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #82 _PyObject_VectorcallTstate (kwnames=, nargsf=, args=0x2ab7507cd8b8, callable=0x2aab5a3c35b0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #83 method_vectorcall (method=, args=0x2ab7507cd8c0, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/classobject.c:53 #84 0x00000000004ef0e3 in _PyObject_VectorcallTstate (kwnames=0x2aab5a3db430, nargsf=, args=, callable=0x2aafa1a2f640, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #85 PyObject_Vectorcall (kwnames=0x2aab5a3db430, nargsf=, args=, callable=0x2aafa1a2f640) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123 #86 call_function (kwnames=0x2aab5a3db430, oparg=, pp_stack=, trace_info=0x2ab24a4ffe70, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893 #87 _PyEval_EvalFrameDefault (tstate=, f=0x2ab7507cd730, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4231 #88 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2ab7507cd730, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #89 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab5a3c3530, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 
#90 _PyFunction_Vectorcall (func=0x2aab5a3c3520, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #91 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2ab760c7b2e8, callable=0x2aab5a3c3520, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #92 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2ab760c7b2e8, callable=0x2aab5a3c3520) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123 #93 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a500030, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893 #94 _PyEval_EvalFrameDefault (tstate=, f=0x2ab760c7b140, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198 #95 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2ab760c7b140, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #96 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab5a3c3c80, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #97 _PyFunction_Vectorcall (func=0x2aab5a3c3c70, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #98 0x00000000004f08a9 in do_call_core (kwdict=0x2aafa1a23f00, callargs=0x2ab24b44c400, func=0x2aab5a3c3c70, trace_info=0x2ab24a5001f0, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5945 #99 _PyEval_EvalFrameDefault (tstate=, f=0x2aabc3a76510, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4277 #100 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2aabc3a76510, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #101 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab5cd65490, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #102 _PyFunction_Vectorcall (func=0x2aab5cd65480, stack=, nargsf=, 
kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #103 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2ab73c4f0310, callable=0x2aab5cd65480, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #104 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2ab73c4f0310, callable=0x2aab5cd65480) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123 #105 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a5003b0, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893 #106 _PyEval_EvalFrameDefault (tstate=, f=0x2ab73c4f0180, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198 #107 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2ab73c4f0180, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #108 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aab9595fad0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #109 _PyFunction_Vectorcall (func=0x2aab9595fac0, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #110 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2ab73c008968, callable=0x2aab9595fac0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #111 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2ab73c008968, callable=0x2aab9595fac0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123 #112 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a500570, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893 #113 _PyEval_EvalFrameDefault (tstate=, f=0x2ab73c0087c0, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198 #114 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2ab73c0087c0, tstate=0x2ab73c0608c0) at 
/usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #115 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aabc3a49760, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #116 _PyFunction_Vectorcall (func=0x2aabc3a49750, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #117 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2aabc3ab6550, callable=0x2aabc3a49750, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 --Type <RET> for more, q to quit, c to continue without paging-- #118 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2aabc3ab6550, callable=0x2aabc3a49750) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123 #119 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a500730, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893 #120 _PyEval_EvalFrameDefault (tstate=, f=0x2aabc3ab63e0, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198 #121 0x0000000000509857 in _PyEval_EvalFrame (throwflag=0, f=0x2aabc3ab63e0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #122 _PyEval_Vector (kwnames=0x0, argcount=1, args=0x2ab24a500818, locals=0x0, con=0x2aabc3a495b0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #123 _PyFunction_Vectorcall (kwnames=0x0, nargsf=1, stack=0x2ab24a500818, func=0x2aabc3a495a0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #124 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=1, args=0x2ab24a500818, callable=0x2aabc3a495a0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #125 method_vectorcall (method=, args=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/classobject.c:61 #126 0x00000000004f08a9 in do_call_core (kwdict=0x2ab24a2a7040, 
callargs=0x2aaaaae78070, func=0x2aaf53f6b080, trace_info=0x2ab24a500940, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5945 #127 _PyEval_EvalFrameDefault (tstate=, f=0x2aabc3ab67a0, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4277 #128 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2aabc3ab67a0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #129 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aaaab161370, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #130 _PyFunction_Vectorcall (func=0x2aaaab161360, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #131 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2aaf52c618f0, callable=0x2aaaab161360, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #132 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2aaf52c618f0, callable=0x2aaaab161360) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123 #133 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a500b00, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893 #134 _PyEval_EvalFrameDefault (tstate=, f=0x2aaf52c61780, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198 #135 0x00000000004fdd4f in _PyEval_EvalFrame (throwflag=0, f=0x2aaf52c61780, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #136 _PyEval_Vector (kwnames=, argcount=, args=, locals=0x0, con=0x2aaaab161640, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #137 _PyFunction_Vectorcall (func=0x2aaaab161630, stack=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #138 0x00000000004ee461 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=, args=0x2aaf5284d470, callable=0x2aaaab161630, 
tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #139 PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x2aaf5284d470, callable=0x2aaaab161630) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:123 #140 call_function (kwnames=0x0, oparg=, pp_stack=, trace_info=0x2ab24a500cc0, tstate=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5893 #141 _PyEval_EvalFrameDefault (tstate=, f=0x2aaf5284d300, throwflag=) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:4198 #142 0x0000000000509857 in _PyEval_EvalFrame (throwflag=0, f=0x2aaf5284d300, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #143 _PyEval_Vector (kwnames=0x0, argcount=1, args=0x2ab24a500da8, locals=0x0, con=0x2aaaab161400, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Python/ceval.c:5067 #144 _PyFunction_Vectorcall (kwnames=0x0, nargsf=1, stack=0x2ab24a500da8, func=0x2aaaab1613f0) at /usr/local/src/conda/python-3.10.14/Objects/call.c:342 #145 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=1, args=0x2ab24a500da8, callable=0x2aaaab1613f0, tstate=0x2ab73c0608c0) at /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #146 method_vectorcall (method=, args=, nargsf=, kwnames=) at /usr/local/src/conda/python-3.10.14/Objects/classobject.c:61 #147 0x00000000005e5dd5 in thread_run (boot_raw=0x2aabc3aba940) at /usr/local/src/conda/python-3.10.14/Modules/_threadmodule.c:1100 #148 0x00000000005e5d34 in pythread_wrapper (arg=) at /usr/local/src/conda/python-3.10.14/Python/thread_pthread.h:248 #149 0x00002aaaaac83ac3 in start_thread (arg=) at ./nptl/pthread_create.c:442 #150 0x00002aaaaad14a04 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

wciq1208 avatar Jul 09 '24 03:07 wciq1208

Those hyperlinks cause all kinds of mention pollution. Could you put the log inside a code block?

rmast avatar Aug 25 '24 19:08 rmast

I ran into the same problem. My server code: (image) Error: (image)

HaoRenkk123 avatar Sep 11 '24 07:09 HaoRenkk123

That's expected, as one of marker's dependencies (pypdfium2/pdfium) is not thread-compatible: It is the caller's responsibility to use locks or similar to prevent threaded access of pdfium's APIs. https://pypdfium2.readthedocs.io/en/stable/python_api.html#thread-incompatibility

mara004 avatar Oct 08 '24 18:10 mara004

I have already used a thread lock during parsing, but the core dump occurred during GC

wciq1208 avatar Oct 10 '24 02:10 wciq1208

Hmm, pypdfium2 auto-closes pdfium objects on garbage collection using weakref.finalize() if the caller did not close them explicitly. In the above backtrace, there is an FPDF_ClosePage() call in frame #17. The question is, what causes simultaneous pdfium calls during the GC phase, and how can we prevent that? Or is there some prior corruption that causes the close call to fail?

A possible workaround/test might be to add explicit close calls to all pdfium root objects throughout the dependencies and see if that fixes the issue.
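As a rough sketch of that test, assuming the caller keeps a reference to every pdfium-backed object it opens: close children before parents, and do so while holding the same lock that guards parsing, so that nothing is left for the garbage collector to finalize later. `PDFIUM_LOCK` and `close_all` are hypothetical names, not part of pypdfium2 or marker; the only thing assumed of each object is a `close()` method.

```python
import threading

# Hypothetical process-wide lock; it should be the same lock that guards parsing.
# An RLock lets close_all() be called from code that already holds the lock.
PDFIUM_LOCK = threading.RLock()


def close_all(objects, lock=PDFIUM_LOCK):
    """Close objects in reverse order of creation (children before parents)."""
    with lock:
        for obj in reversed(list(objects)):
            obj.close()


# Illustrative usage with pypdfium2 (not executed here):
#   import pypdfium2 as pdfium
#   with PDFIUM_LOCK:
#       pdf = pdfium.PdfDocument("/path/to/pdf/file.pdf")
#       page = pdf[0]
#       ...  # work with the page
#       close_all([pdf, page])
```

If the crash disappears once every object is closed this way, that would point at the GC-time auto-close as the source of the unsynchronized pdfium call.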

Unfortunately, threading/GC-related issues are hard to debug.

mara004 avatar Oct 10 '24 14:10 mara004

I had a similar error with the corrupted double-linked list (Colab A100). Running this code fixed the problem for me:

from threading import RLock
from contextlib import contextmanager
import logging
import time
import traceback

class SafeLock:
    def __init__(self, name="SafeLock", timeout=60, max_retries=3, retry_delay=1):
        self._lock = RLock()  # Reentrant lock is safer than regular Lock
        self.name = name
        self.timeout = timeout
        self.max_retries = max_retries
        self.retry_delay = retry_delay
        self.logger = logging.getLogger(__name__)

    @contextmanager
    def acquire_safely(self):
        # Retry only the acquisition. The yield sits outside the retry loop, so an
        # exception raised in the caller's block is logged and re-raised once
        # instead of being caught by the retry logic and yielding a second time.
        for attempt in range(1, self.max_retries + 1):
            if self._lock.acquire(timeout=self.timeout):
                break
            self.logger.warning(
                f"Failed to acquire lock {self.name} (attempt {attempt}/{self.max_retries})"
            )
            time.sleep(self.retry_delay)
        else:
            raise TimeoutError(f"Failed to acquire {self.name} after {self.max_retries} attempts")

        try:
            yield
        except Exception as e:
            self.logger.error(f"Error while holding lock: {str(e)}\n{traceback.format_exc()}")
            raise
        finally:
            self._lock.release()

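# convert_single_pdf and model_lst (the loaded models) are assumed to be defined already, as in the original post.
safe_lock = SafeLock(name="PDFConverter", timeout=120, max_retries=5, retry_delay=2)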
fpath = "/path/to/pdf/file.pdf"
with safe_lock.acquire_safely():
    try:
        full_text, images, out_meta = convert_single_pdf(fpath, model_lst)
    except Exception as e:
        logging.error(f"PDF conversion error: {str(e)}\n{traceback.format_exc()}")
        raise

aj8907 avatar Nov 26 '24 17:11 aj8907

@aj8907 Sorry, I'm not much into threading, but I don't logically see how this is supposed to fix the above issue? Why is a plain RLock not sufficient?

The question is, who causes simultaneous pdfium calls during the GC phase, and how can we prevent that? Or is there some prior corruption that causes the close call to fail?

If the cause is indeed simultaneous calls due to GC, and not other caller-caused corruption, I figured we may be able to add an API to plug a caller-provided lock into our auto-close machinery. @wciq1208, or anyone else affected: The prerequisite for me to work on this would be a minimal reproducible example (the snippet in the initial post is incomplete).
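To make the idea concrete, here is a minimal sketch of what a lock-aware auto-close could look like, using only the standard library. It is not pypdfium2's implementation: `AutoCloseable` here is a toy class, and `_locked_close`, `close_func` and `raw` are hypothetical names chosen for illustration.

```python
import threading
import weakref


def _locked_close(lock, close_func, raw):
    # Runs either from an explicit close() or from the garbage collector,
    # in whichever thread happens to trigger the collection.
    with lock:
        close_func(raw)


class AutoCloseable:
    """Toy stand-in for a wrapper around a native handle (not pypdfium2's class)."""

    def __init__(self, raw, close_func, lock):
        self.raw = raw
        # weakref.finalize calls _locked_close at most once: when this object is
        # collected or when close() is called, whichever happens first.
        self._finalizer = weakref.finalize(self, _locked_close, lock, close_func, raw)

    def close(self):
        self._finalizer()
```

The lock passed in would need to be the same one the caller holds around all other pdfium calls, and a reentrant lock (`threading.RLock`) seems the safer choice, since a collection can be triggered in the thread that already holds it.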

mara004 avatar Nov 26 '24 18:11 mara004

I have switched from multithreading to multiprocessing for inference, and my project version is currently at 0.2.17. Since the project seems to be undergoing a restructuring for version 2, I don't need a solution to this issue for now.

wciq1208 avatar Nov 27 '24 02:11 wciq1208

v1.6.2

corrupted double-linked list
Fatal Python error: Aborted

Current thread 0x0000753f4d600640 (most recent call first):
  File "/home/ /pywork/test/.venv/lib/python3.11/site-packages/pypdfium2/_helpers/document.py", line 104 in _close_impl
  File "/home/ /pywork/test/.venv/lib/python3.11/site-packages/pypdfium2/internal/bases.py", line 44 in _close_template
  File "/home/ /.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/weakref.py", line 590 in __call__
  File "/home/ /pywork/test/.venv/lib/python3.11/site-packages/pypdfium2/internal/bases.py", line 102 in close
  File "/home/ /pywork/test/.venv/lib/python3.11/site-packages/marker/providers/pdf.py", line 118 in get_doc
  File "/home/ /.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/contextlib.py", line 144 in __exit__
  File "/home/ /pywork/test/.venv/lib/python3.11/site-packages/marker/providers/pdf.py", line 403 in get_images
  File "/home/ /pywork/test/.venv/lib/python3.11/site-packages/marker/builders/document.py", line 41 in build_document
  File "/home/ /pywork/test/.venv/lib/python3.11/site-packages/marker/builders/document.py", line 32 in __call__
  File "/home/ /pywork/test/.venv/lib/python3.11/site-packages/marker/converters/pdf.py", line 144 in build_document
  File "/home/ /pywork/test/.venv/lib/python3.11/site-packages/marker/converters/pdf.py", line 154 in __call__

How can I resolve the problem?

eleking328 avatar Apr 29 '25 10:04 eleking328

You could try patching your lock into pypdfium2's _close_template function and see if that fixes the crash. Again, a minimal reproducible example would be helpful.
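A generic way to try this is a small monkeypatching helper like the sketch below. `patch_with_lock` is a hypothetical helper, not a pypdfium2 API. The commented-out application assumes that `_close_template` is a module-level function in `pypdfium2.internal.bases`, which is only inferred from the traceback above and is untested.

```python
import functools
import threading


def patch_with_lock(owner, name, lock):
    """Replace owner.<name> with a wrapper that holds `lock` during each call."""
    original = getattr(owner, name)

    @functools.wraps(original)
    def locked(*args, **kwargs):
        with lock:
            return original(*args, **kwargs)

    setattr(owner, name, locked)
    return original  # returned so the patch can be undone


# Hypothetical application (module path inferred from the traceback, untested):
#   import pypdfium2.internal.bases as bases
#   PDFIUM_LOCK = threading.RLock()  # the same lock used around conversion
#   patch_with_lock(bases, "_close_template", PDFIUM_LOCK)
```

The patch would have to be applied before any pdfium objects are created, and a reentrant lock is assumed because a close can be triggered from code that already holds the lock.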

mara004 avatar Apr 29 '25 13:04 mara004