pgrx

Possible memory leak

Open ccleve opened this issue 2 years ago • 4 comments

I'm developing an index access method. In the ambuild method / build_callback I call a pg_extern function to process a string column that is coming from the table getting indexed.

If I hard-code the function and call it directly, there is no problem. If I store a reference to the function and call it dynamically, I get a leak. The function is a support function defined in an op class.

Here is the simplified code:

let func: &mut FmgrInfo = index_getprocinfo(indexrel, attr_num, MY_SUPPORT_PROC_NUM);
let result: Datum = FunctionCall1Coll(func, InvalidOid, val);
let tokens: Tokens = FromDatum::from_datum(result, false).unwrap();

This works, but there's a memory leak that grows to gigabytes and then goes away when indexing is done.

Large numbers of 8k blocks are getting allocated and not released. I used Instruments to track the allocations. Here's the call stack for one of the allocations:

AllocSetAlloc	
palloc	
initStringInfo	
makeStringInfo	
pgrx_pg_sys::include::pg16::makeStringInfo::_$u7b$$u7b$closure$u7d$$u7d$::ha67b83a70b8da36b	
pgrx_pg_sys::include::pg16::makeStringInfo::h52a9eb4e08c9e507	
pgrx::stringinfo::StringInfo::new::h14b298594cbdae1b	
pgrx::datum::varlena::cbor_encode::had528f965d5a8900	
pgrx::datum::varlena::_$LT$impl$u20$pgrx..datum..into..IntoDatum$u20$for$u20$T$GT$::into_datum::h8b9acf0b0f90092b	
relevantdb::pipeline::standard_pipeline::std_index_pipe_wrapper::std_index_pipe_wrapper_inner::h0b84d5a6b1c79aed	
relevantdb::pipeline::standard_pipeline::std_index_pipe_wrapper::_$u7b$$u7b$closure$u7d$$u7d$::hfe83f2098858fcdc	
std::panicking::try::do_call::hd4c8272fb472bd5d	
__rust_try	
std::panicking::try::he4259e6d44ebdbc5	
std::panic::catch_unwind::h29981eda532d524a	
pgrx_pg_sys::submodules::panic::run_guarded::h06df0a8400822769	
pgrx_pg_sys::submodules::panic::pgrx_extern_c_guard::h0e55d85a558e12fb	
std_index_pipe_wrapper	
FunctionCall1Coll	
pgrx_pg_sys::include::pg16::FunctionCall1Coll::_$u7b$$u7b$closure$u7d$$u7d$::hfd2b8f072ef60dde	
pgrx_pg_sys::include::pg16::FunctionCall1Coll::h132fdeac39f4a9fb	
relevantdb::access::build::build_callback_internal::h70bc350fd6c65d8e	
relevantdb::access::build::build_callback::build_callback_inner::h752fe11472b1aaa9	
relevantdb::access::build::build_callback::_$u7b$$u7b$closure$u7d$$u7d$::heed3800ce7d87215	
std::panicking::try::do_call::h235dc22f0d3bf6c1	
__rust_try	
std::panicking::try::h700609f12d2df316	
std::panic::catch_unwind::h416c407bf0546f94	
pgrx_pg_sys::submodules::panic::run_guarded::hee92e03a4a348986	
pgrx_pg_sys::submodules::panic::pgrx_extern_c_guard::h117fdc4c7f044bea	
relevantdb::access::build::build_callback::hcdff0de70bedcd5e	
heapam_index_build_range_scan	
relevantdb::access::build::ambuild::ambuild_inner::h1e08ba58ee209300	
relevantdb::access::build::ambuild::_$u7b$$u7b$closure$u7d$$u7d$::hf4e021fa8557db7a	
std::panicking::try::do_call::hf3cae0311d4b0801	
__rust_try	
std::panicking::try::h1485f64214f2a848	
std::panic::catch_unwind::h55b1bc33f0760e59	
pgrx_pg_sys::submodules::panic::run_guarded::hb6b6d34c504f148a	
pgrx_pg_sys::submodules::panic::pgrx_extern_c_guard::h1494dc59682cc685	
relevantdb::access::build::ambuild::hffa4ebd229238b3b	
index_build	
index_create	
DefineIndex	
ProcessUtilitySlow	
standard_ProcessUtility	
PortalRunUtility	
PortalRunMulti	
PortalRun	
exec_simple_query	
PostgresMain	
BackendRun	
BackendStartup	
PostmasterMain	
main	
start	

My support function is std_index_pipe(input: &str):

#[pg_extern(immutable, strict, parallel_safe)]
pub fn std_index_pipe(input: &str) -> Tokens {
  // make some tokens here
}

Just before I call the function I switch to a custom memory context, call the func, switch back, and reset() the context. I get the leak whether I do that or not.
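
For readers following along, here is a minimal sketch of the switch / call / switch back / reset pattern described above. It is a toy model in plain Rust, not the Postgres or pgrx API: `Contexts`, `support_function`, `build_rows`, and the context names `per_tuple` and `long_lived` are all hypothetical, and the 8192-byte allocation only mirrors the 8k blocks mentioned earlier. It shows what the pattern is expected to guarantee, and the one way memory can still accumulate in this model.

```rust
// Toy model of Postgres-style memory contexts. Every name here is
// hypothetical; none of this is the real pgrx or Postgres API.
use std::collections::HashMap;

struct Contexts {
    // Bytes currently charged to each named context.
    allocated: HashMap<&'static str, usize>,
    // The context that receives new allocations.
    current: &'static str,
}

impl Contexts {
    fn new(initial: &'static str) -> Self {
        let mut allocated = HashMap::new();
        allocated.insert(initial, 0);
        Contexts { allocated, current: initial }
    }

    // Analogous to MemoryContextSwitchTo: returns the previous context.
    fn switch_to(&mut self, name: &'static str) -> &'static str {
        self.allocated.entry(name).or_insert(0);
        std::mem::replace(&mut self.current, name)
    }

    // Analogous to palloc: charge the bytes to the current context.
    fn alloc(&mut self, bytes: usize) {
        *self.allocated.entry(self.current).or_insert(0) += bytes;
    }

    // Analogous to MemoryContextReset: release everything in one context.
    fn reset(&mut self, name: &'static str) {
        if let Some(bytes) = self.allocated.get_mut(name) {
            *bytes = 0;
        }
    }

    fn bytes_in(&self, name: &'static str) -> usize {
        self.allocated.get(name).copied().unwrap_or(0)
    }
}

// Stand-in for the support function. When `escapes` is true it allocates
// in a different context than the one the caller made current.
fn support_function(ctxs: &mut Contexts, escapes: bool) {
    if escapes {
        let old = ctxs.switch_to("long_lived");
        ctxs.alloc(8192);
        ctxs.switch_to(old);
    } else {
        ctxs.alloc(8192);
    }
}

// Stand-in for the build callback loop: one call per indexed row.
fn build_rows(rows: usize, escapes: bool) -> Contexts {
    let mut ctxs = Contexts::new("build");
    for _ in 0..rows {
        let old = ctxs.switch_to("per_tuple");
        support_function(&mut ctxs, escapes);
        ctxs.switch_to(old);
        ctxs.reset("per_tuple");
    }
    ctxs
}

fn main() {
    let bounded = build_rows(1000, false);
    println!("per_tuple after build: {} bytes", bounded.bytes_in("per_tuple"));

    let growing = build_rows(1000, true);
    println!("long_lived after build: {} bytes", growing.bytes_in("long_lived"));
}
```

In this model the per-row reset keeps memory flat as long as every allocation is charged to the context that was current at call time. Memory only grows when an allocation lands in some other context. Whether that is what happens in the real build is not established in this thread.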

I ran select * from pg_backend_memory_contexts; to see if there was a context getting filled up. The sum of total_bytes across all contexts was far less than the amount of data piling up in memory. This I don't understand; palloc() should do the allocation in some memory context somewhere, right? All this memory does get cleared when the build is complete, which means that it is in a context; I just can't see it. Odd.

Any idea how to track this problem down?

ccleve avatar Oct 11 '23 18:10 ccleve

> Just before I call the function I switch to a custom memory context, call the func, switch back, and reset() the context. I get the leak whether I do that or not.

Can you show us this code?

ZDB takes a similar approach in the build callback function for the same general reason and it's fine.

eeeebbbbrrrr avatar Oct 11 '23 19:10 eeeebbbbrrrr

I used code that is almost identical to that in ZDB. Literally identical.

ccleve avatar Oct 11 '23 20:10 ccleve

> I used code that is almost identical to that in ZDB. Literally identical.

I'm that guy that tends to blindly believe what he reads on the internet, but in this case, can you please show us exactly what your code is doing? The code I see when I close my eyes seems to work just fine.

eeeebbbbrrrr avatar Oct 13 '23 14:10 eeeebbbbrrrr

There's a lot of intervening code. Let me make a small example that reproduces the problem.

ccleve avatar Oct 13 '23 16:10 ccleve