pg_wait_sampling [refer #PGPRO-6599]: Avoid race conditions while processing PROFILE

…EQUEST and

PROFILE_RESET requests.

After initialization of "request" variable in collector.c main loop, another client backend could change the value of "collector_hdr->request" variable. Changing this value from "PROFILE_RESET" to "PROFILE_REQUEST" could cause deadlock: "collector" processed "PROFILE_RESET" query while client backend waits data from the "collector_mq" shared memory message queue. Now we read and write "collector_hdr->request" variable only using "PGWS_COLLECTOR_LOCK" lock.

Removed obsolete "read_current_wait()" function definition from pg_wait_sampling.h.

tags: pg_wait_sampling

Jul 04 '22 11:07 rzharkov

Changing this value from "PROFILE_RESET" to "PROFILE_REQUEST" could cause deadlock: "collector" processed "PROFILE_RESET" query while client backend waits data from the "collector_mq" shared memory message queue.

Not deadlock but infinite waiting of requested profile backend process caused by lost PROFILE_REQUEST request (if reset of request variable happened after setting of PROFILE_REQUEST value).

In general, your proposed solution (moving of request value saving under collector lock) resolves your problem. But we might end up to lost PROFILE_RESET request when PROFILE_RESET and successive some another signal arrive before collector acquires its lock. And root cause I see in asynchronous nature of PROFILE_RESET request: client backend just send signal (set latch) and release queue lock allowing other clients to perform other requests.

You might rewrite patch to include some feedback mechanics for backend performing PROFILE_RESET request, e.g., via waiting around collector's message queue, leaving the initial state of codebase (before your patch) unchanged. Or you might stick out for your solution - I figure losing of PROFILE_RESET signal as non-critical and anyway I plan to rewrite a whole mechanics of IPC communication soon. But in this case the moving of request variable reading under collector lock is enough, other changes that incorporates this variable setting on backend's side under lock looks redundant.

Aug 10 '22 19:08 maksm90

Making PROFILE_RESET request processing synchronous was not the goal of these patches. But this synchronization could be easily added. There were two problems:

deadlock (stuck, enter inconsistent state with denial of service, etc.) while processing script below

psql -c "CREATE EXTENSION pg_wait_sampling"

for j in {1..100}; do echo "iteration $j"

for i in {1..10}; do echo " SELECT * FROM pg_wait_sampling_get_profile(); SELECT pg_wait_sampling_reset_profile(); " | psql >psql$i.log & done wait

done

Coredumps while parallel processing of two "queries" regression tests with "aqo" module loaded. aqo+pgws.txt

Aug 11 '22 03:08 rzharkov

deadlock (stuck, enter inconsistent state with denial of service, etc.) while processing script below

psql -c "CREATE EXTENSION pg_wait_sampling" for j in {1..100}; do echo "iteration $j" for i in {1..10}; do echo " SELECT * FROM pg_wait_sampling_get_profile(); SELECT pg_wait_sampling_reset_profile(); " | psql >psql$i.log & done wait done

Yes, deadlock in your presented above scenario occurs when successive queries pg_wait_sampling_get_profile() try to acquire queue lock and get stuck due to another backend process that waits on message queue already holds this lock and collector that had lost PROFILE_REQUEST signal waits for new signals.

As I have mentioned, making pg_wait_sampling_reset_profile() synchronous resolves this and more race condition problems. But you might leave your solution (reorder collector lock acquiring and request variable saving on collector side) - the losing of PROFILE_RESET signals is not critical on current stage.

Coredumps while parallel processing of two "queries" regression tests with "aqo" module loaded. aqo+pgws.txt

I cannot find coredump(

Aug 11 '22 08:08 maksm90

@rzharkov could you make necessary changes - remove LockRelease() moving in receive_array() and pg_wait_sampling_reset_profile() functions? This moving is redundant as it doesn't protect collector_hdr->request shared variable but just is used to wait for collector completes its communication work.

Aug 15 '22 06:08 maksm90

Of course, I'll do it. The task is stuck, because my reviewer is on vacation :)

Aug 15 '22 06:08 rzharkov

could you make necessary changes - remove LockRelease() moving in receive_array() and pg_wait_sampling_reset_profile() functions? This moving is redundant as it doesn't protect collector_hdr->request shared variable but just is used to wait for collector completes its communication work.

No, it's not redundant. Consider the following interleaving:

S1 executed pg_wait_sampling_reset_profile and set collector_hdr->request to PROFILE_RESET.
S2 is executing receive_array. It acquires the queue lock, acquires and immediately releases the collector lock...
Now collector acquires the collector lock and reads PROFILE_RESET from S1.
S2 continues and sets collector_hdr->request to PROFILE_REQUEST.
collector continues and overwrites collector_hdr->request with NO_REQUEST and PROFILE_REQUEST is lost.

The following test script hangs sooner or later if collector_hdr->request is not protected with the lock in receive_array:

for i in `seq 1000`; do
	echo $i
	for j in `seq 10`; do
		psql <<EOF >/dev/null &
select * from pg_wait_sampling_profile;
select pg_wait_sampling_reset_profile();
EOF
	done
	wait
done

For pg_wait_sampling_reset_profile it doesn't matter, but it's better to keep both functions in sync.

Aug 30 '22 14:08 shinderuk

+extern const LOCKTAG   queueTag;
+extern const LOCKTAG   collectorTag;
...
-extern void init_lock_tag(LOCKTAG *tag, uint32 lock);

Instead of init_lock_tag that clashes with the like named symbol in aqo, we now have two other unprefixed external symbols that may clash with other modules. I think we should prefix all external symbols with "pgws_" to be safe. And it's better to do this in a separate PR. (Roman, I'm sorry for asking you to fix both problems together. Now I see that they are unrelated.) I will split the patch.

Aug 30 '22 15:08 shinderuk

Consider the following interleaving:

1. S1 executed pg_wait_sampling_reset_profile and set `collector_hdr->request` to `PROFILE_RESET`.

2. S2 is executing receive_array. It acquires the queue lock, acquires and immediately releases the collector lock...

3. Now collector acquires the collector lock and reads `PROFILE_RESET` from S1.

4. S2 continues and sets `collector_hdr->request` to `PROFILE_REQUEST`.

5. collector continues and overwrites `collector_hdr->request` with `NO_REQUEST` and `PROFILE_REQUEST` is lost.

Hmm, agreed. Free interference of collector_hdr->request setting inside receive_array() function and non-atomic manipulation with this variable inside collector imposes concurrency issues.

Approved this fix.

@rzharkov could you rewrite your commit message so that schedules causing concurrency issue would be exposes in step-by-step manner. And this message have to include two schedules (your provided and one from @shinderuk) to motivate proposed changes: reorder request keeping after collector lock acquiring in collector process and setting request variable in regular backends under this lock.

Aug 30 '22 15:08 maksm90

@maksm90 I dropped the changes related to init_lock_tag from the patch and rewrote the commit message. Please take a look. Are you satisfied with the commit message?

Aug 31 '22 12:08 shinderuk

I dropped the changes related to init_lock_tag from the patch

It looked as harmless refactoring. I would accepted patch with this fix.

I dropped the changes related to init_lock_tag from the patch and rewrote the commit message.

All right! Approved.

Aug 31 '22 13:08 maksm90

pg_wait_sampling
pg_wait_sampling copied to clipboard

[refer #PGPRO-6599]: Avoid race conditions while processing PROFILE_R…

pg_wait_sampling pg_wait_sampling copied to clipboard

[refer #PGPRO-6599]: Avoid race conditions while processing PROFILE_R…

pg_wait_sampling
pg_wait_sampling copied to clipboard