pg_wait_sampling Add more columns/dimensions to pg_wait

Following the discussion in #94, I decided to add more columns to pg_wait_sampling_current/history/profile

I've added most of the columns from request in #94, but some fields from "pg_stat_activity" (= localBackendStatusTable) like query_text, query_start, query_id (if we actually took it from "pg_stat_activity") work in the following way: they save values at the start of the query and DO NOT clear those fields when the query has ended. So after executing "select 1;" in some client backend we will always sample its query_text and query_start until we execute other query. We specifically avoided this kind of sampling when we were making our way of tracking query_id with Executor hooks.

All new columns are added to new views with suffix "_extended". Those views COULD be changed between extension versions, while existing views without those suffixes SHOULD NOT be changed.

One more thing to highlight - since we have to look into localBackendStatusTable when we sample some columns, we have reduced perfomance. This is unavoidable and is noted in updated README. BUT, in PostgreSQL 13-16 we can't reliably link ProcGlobal and localBackendStatusTable arrays and from this was born get_beentry_by_procpid. For each interesting process from ProcGlobal we have to iterate through localBackendStatusTable. This has O(n^2) time complexity, where n is a number of all backends. Very inefficient. I probably could remake the collector code to iterate through localBackendStatusTable first and then find corresponding entries in ProcGlobal but I have not investigated it and am not sure it would be faster (it may be faster, depending on ProcGlobal access/iteration)

There could some sloppy code, including possible "you should copy the struct here, not use pointer" moments

Everyone is welcome to review the code and share their thoughts.

Mar 26 '25 10:03 Medvecrab

@Medvecrab many thanks for this.

Performance penalty is exact risk that I would like to avoid while still want to have some extra dimensions. If we add fields from PGROC isBackgroundWorker, databaseId and roleId (like here https://github.com/postgrespro/pg_wait_sampling/pull/95) we do not get increased observer effect.

So my suggestion to add fields from PGPROC to the "main" views and fields from localBackendStatusTable to _extended views

Mar 26 '25 16:03 DmitryNFomin

Performance penalty is exact risk that I would like to avoid while still want to have some extra dimensions. If we add fields from PGROC isBackgroundWorker, databaseId and roleId (like here #95) we do not get increased observer effect.

So my suggestion to add fields from PGPROC to the "main" views and fields from localBackendStatusTable to _extended views

The idea of *_extended views is to add ALL additional fields to them (also any that could be added in the future) so the original views remain the same for backwards compatibility. I agree that if we were rewriting pg_wait_sampling anew, it would make some sense to distribute columns differently, but alas. This is also the reason that with my patch we still set profiling per profile/query_id with only the old GUC variables, not with the new ones

When all fields from localBackendStatusTable are turned off (well, none of them are listed in either history_dimensions or profile_dimensions), the performance shouldn't take a hit (there are specific guards for those cases in PR)

Mar 26 '25 17:03 Medvecrab

agree with your point, just one more comment, can we add isBackgroundWorker from PGPROC? I understand that backend_type is more detailed, but it we can get it without performance penalty while we can distinct regular backends in performance analysis already

Mar 27 '25 08:03 DmitryNFomin

Seems like a fair request, will probably add after/during review

Mar 27 '25 09:03 Medvecrab

This pull request is obsoleted by #104

Oct 22 '25 07:10 Medvecrab

pg_wait_sampling
pg_wait_sampling copied to clipboard

Add more columns/dimensions to pg_wait_sampling

pg_wait_sampling pg_wait_sampling copied to clipboard

Add more columns/dimensions to pg_wait_sampling

pg_wait_sampling
pg_wait_sampling copied to clipboard