plrust icon indicating copy to clipboard operation
plrust copied to clipboard

BorrowMut error when nesting calls with SPI

Open jeff-davis opened this issue 3 years ago • 2 comments

create or replace function fn1(i int) returns int strict language plrust as $$
[code]
notice!("{}", "fn1 started");
let cmd = format!("select fn2({})", i);
Spi::execute(|client|
    {
        client.select(&cmd, None, None);
    });
notice!("{}", "fn1 finished");
Some(1)
$$;

create or replace function fn2(i int) returns int strict language plrust as $$
[code]
notice!("{}", "fn2 started");
notice!("{}", "fn2 finished");
Some(2)
$$;

select fn1(1);

NOTICE:  fn1 started
ERROR:  already borrowed: BorrowMutError
CONTEXT:  src/plrust.rs:50:56
SQL statement "select fn2(1)"

jeff-davis avatar Sep 13 '22 18:09 jeff-davis

related to #78, maybe?

workingjubilee avatar Sep 13 '22 19:09 workingjubilee

Likely, but I wasn't able to confirm so I filed a separate issue.

jeff-davis avatar Sep 13 '22 19:09 jeff-davis

Updated re-create:

create or replace function fn1(i int) returns int strict language plrust as $$
[code]
notice!("{}", "fn1 started");
let cmd = format!("select fn2({})", i);
Spi::connect(|client|
    {
        client.select(&cmd, None, None);
    });
notice!("{}", "fn1 finished");
Some(1)
$$;

create or replace function fn2(i int) returns int strict language plrust as $$
[code]
notice!("{}", "fn2 started");
notice!("{}", "fn2 finished");
Some(2)
$$;

select fn1(1);

This is not related to #78. The problem is that f1() calls f2() through the executor and both are plrust functions, so we're recursing into crate::plrust:::evaluate_function(), and the LOADED_SYMBOLS static has already been mutably borrowed by the first call to f1().

We can either put LOADED_SYMBOLS behind a Arc<RwLock> or just go full unsafe and make it a static mut: HashMap. I'm actually leaning towards the latter as Postgres isn't threaded and pgx otherwise prevents a user thread from doing FFI into postgres.

EDIT: My main concern, other than correctness of course, is introducing any kind of additional overhead when executing a user function. While a RwLock would always be uncontended, there's still overhead around locking and unlocking and I suspect it would show up in profiling.

@workingjubilee, do you have any thoughts here?

eeeebbbbrrrr avatar Jan 06 '23 21:01 eeeebbbbrrrr

If we need to improve performance, I would sooner have us customize our own concurrency-sound access type than simply ripping off the safety requirements and using fully racy access... one of the things we're running into quite a lot is that while Postgres is nominally single-threaded, yes, there's other ways Postgres can be subtly concurrent and thus... annoying. There are also optimizations that Postgres makes available specifically to language handlers that we are currently not yet exploiting.

workingjubilee avatar Jan 07 '23 03:01 workingjubilee

I’m fine with a safe and correct approach.

I just wanna note that I bet it’ll come up for functions that are being called once for every row of some gigantic table.

I suppose part of my view here is if we were doing this in C it wouldn’t even be a thought. We’d have some static HTAB and just go bananas.

But I agree… this ain’t C.

eeeebbbbrrrr avatar Jan 07 '23 03:01 eeeebbbbrrrr

Also... Rust’s RwLocks aren’t re-entrant, are they? If they’re not, they’re out anyways!

Edit: I didn’t consider this earlier today. Hmm 🤔

eeeebbbbrrrr avatar Jan 07 '23 03:01 eeeebbbbrrrr

Yeah, there is... a reason I said "I would sooner have us customize our own concurrency-sound access type". We can rely on std's lock types being sound but we can't necessarily rely on them being useful, especially in the case of recursion. Deadlock would be a... valid response in this situation.

In C we would certainly be writing our own data structure here where we know the exact characteristics of its access and mutation behaviors. We may have that displeasure yet!

workingjubilee avatar Jan 07 '23 03:01 workingjubilee

Actually, I suppose with a RwLock we’d only hold it long enough to .entry().or_insert() into the symbol map. We wouldn’t hold it for the duration of the function execution. As such, it’d never re-enter and wouldn’t deadlock.

The current RefCell.with() code might could be rejiggered that way too?

eeeebbbbrrrr avatar Jan 07 '23 03:01 eeeebbbbrrrr

I've got a fix for this. PR incoming. Turned out to be super simple.

eeeebbbbrrrr avatar Jan 10 '23 20:01 eeeebbbbrrrr

see https://github.com/tcdi/plrust/pull/152

eeeebbbbrrrr avatar Jan 10 '23 20:01 eeeebbbbrrrr

fixed and released in 070b1

eeeebbbbrrrr avatar Jan 21 '23 18:01 eeeebbbbrrrr