citus icon indicating copy to clipboard operation
citus copied to clipboard

Harden columnar public views against cross-session temp relations

Open m3hm3t opened this issue 2 months ago • 3 comments

fixes #8249

On PostgreSQL 18, opening another backend’s temporary relation is rejected earlier. Our public columnar views (columnar.storage, and those that join through it) could traverse other sessions’ temp tables and error with cannot access temporary tables of other sessions. Also, direct callers of columnar.get_storage_id(regclass) could hit the same path on PG18.

What changed

  • SQL migration: citus_columnar--13.2-1--14.0-1.sql

    • Recreate columnar.storage with a temp-safe filter:

      • include permanent/unlogged relations,
      • include this session’s temp relations,
      • exclude other sessions’ temp schemas.
    • Re-emit columnar.stripe, columnar.chunk_group, and columnar.chunk with OR REPLACE so they inherit the safer storage and keep security_barrier.

  • C code: columnar_metadata.c

    • Add a PG18-only precheck in columnar.get_storage_id(regclass) to skip other sessions’ temp relations before relation_open(). For those, the function returns NULL (SQL is STRICT, so callers naturally drop such rows). Behavior on PG15–PG17 is unchanged.

Downgrade note

We intentionally keep the hardened view definitions after downgrade as a bugfix (safer on all supported PGs).

I couldn't find related commit but found these discussions:

https://www.postgresql.org/message-id/2736425.1758475979%40sss.pgh.pa.us https://www.postgresql.org/message-id/CAMEv5_syU0ZopE-2Wr8A8QksqrCyYT2hW06Rgw4RSPdyJO-%3Dfw%40mail.gmail.com https://www.postgresql.org/message-id/flat/CAJDiXghdFcZ8=nh4G69te7iRr3Q0uFyXxb3ZdG09_GTNZXwH0g@mail.gmail.com

m3hm3t avatar Oct 14 '25 13:10 m3hm3t

Codecov Report

:x: Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 88.99%. Comparing base (356f254) to head (5fd40f9).

Additional details and impacted files
@@                    Coverage Diff                     @@
##           m3hm3t/pg18_beta_confs    #8252      +/-   ##
==========================================================
+ Coverage                   86.65%   88.99%   +2.34%     
==========================================================
  Files                         287      287              
  Lines                       63189    63200      +11     
  Branches                     7949     7951       +2     
==========================================================
+ Hits                        54756    56245    +1489     
+ Misses                       5931     4635    -1296     
+ Partials                     2502     2320     -182     
:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Oct 14 '25 13:10 codecov[bot]

Instead, let's fix the columnar views -the views inside columnar schema, not the tables in columnar_internal- in a way that they skip other sessions' temp relations always.

For two reasons;

  1. seems these are not just regression test issues -> users can also query those views
  2. fixing these issues in the catalog view level should automatically fix most of those regression tests

I can imagine a few tests directly executing get_storage_id() and those tests should be updated in a way that they stop directly calling the UDF but they should query columnar.storage as in select storage_id from columnar.storage where relation = ... This should help once we fix columnar.storage in a way that it omits the temp relations of other sessions.

onurctirtir avatar Oct 16 '25 09:10 onurctirtir

Instead, let's fix the columnar views -the views inside columnar schema, not the tables in columnar_internal- in a way that they skip other sessions' temp relations always.

For two reasons;

  1. seems these are not just regression test issues -> users can also query those views
  2. fixing these issues in the catalog view level should automatically fix most of those regression tests

I can imagine a few tests directly executing get_storage_id() and those tests should be updated in a way that they stop directly calling the UDF but they should query columnar.storage as in select storage_id from columnar.storage where relation = ... This should help once we fix columnar.storage in a way that it omits the temp relations of other sessions.

@onurctirtir thank you for feedback

I updated the PR according to your suggestion. If you have time, could you review the PR?

m3hm3t avatar Oct 16 '25 12:10 m3hm3t