[AQUMV] Enable answering queries using materialized views for external tables.

Open avamingli opened this issue 1 year ago • 6 comments

Allow queries to be answered from materialized views that are defined over external or foreign tables. Because we cannot tell whether the data of an external table outside CBDB is up to date, introduce a new GUC:

aqumv_allow_foreign_table

This lets users decide whether a materialized view may be used in place of querying the external tables directly.
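
As a rough sketch of how a session might opt in before running the example that follows: the snippet assumes the new GUC is a boolean that can be set per session, and that the general AQUMV switch is enable_answer_query_using_materialized_views. Neither detail is stated in this description, so treat both as assumptions.

```sql
-- Assumed general AQUMV switch (not named in this PR description).
set enable_answer_query_using_materialized_views = on;

-- GUC introduced by this PR. Treating it as a session-level boolean
-- is an assumption made for illustration.
set aqumv_allow_foreign_table = on;

-- Turn it off again to always read the external table directly.
-- set aqumv_allow_foreign_table = off;
```

Leaving the GUC off would be the cautious choice when the external source changes often, since the materialized view may lag behind it.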

create readable external table aqumv_ext_r(id int) 
location ('demoprot://aqumvtextfile.txt') format 'text';
create materialized view aqumv_ext_mv as
  select * from aqumv_ext_r;

explain (costs off, verbose)
select * from aqumv_ext_r;
               QUERY PLAN
------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)
   Output: id
   ->  Seq Scan on aqumv.aqumv_ext_mv
         Output: id
 Optimizer: Postgres query optimizer

Indexes on the materialized view can also be used, if any exist.

create index on aqumv_ext_mv(id);
explain (costs off, verbose)
select * from aqumv_ext_r where id = 5;
                            QUERY PLAN
----------------------------------------------------------------------
 Gather Motion 1:1  (slice1; segments: 1)
   Output: id
   ->  Index Only Scan using aqumv_ext_mv_id_idx on aqumv.aqumv_ext_mv
         Output: id
         Index Cond: (aqumv_ext_mv.id = 5)
 Optimizer: Postgres query optimizer

avamingli avatar Nov 06 '24 08:11 avamingli

See #693 for more details.

avamingli avatar Nov 06 '24 08:11 avamingli

aqumv_allow_foreign_table

A PostgreSQL-style GUC would be named like enable_XXX, wouldn't it? So maybe enable_aqumv_foreign_table.

reshke avatar Nov 06 '24 14:11 reshke

Not sure. I followed the naming of this one: allow_system_table_mods

avamingli avatar Nov 07 '24 01:11 avamingli

@my-ship-it We have the refresh fast path from #682, but for external tables we don't know the real status (it is always recorded as up to date in gp_maview_aux). That would make the REFRESH command fail to actually reload the external data.

So we should skip the fast path for views that have external tables. That needs a catalog change to record whether a view has external tables outside CBDB.

avamingli avatar Nov 20 '24 07:11 avamingli

Yes, I think it is a reasonable behavior. Thanks!

my-ship-it avatar Nov 21 '24 08:11 my-ship-it

Done in 273dae106985d139305fc79f7267ba8d8e8f507c: added has_foreign to identify whether a materialized view's query has foreign tables. If it is true, we avoid the REFRESH fast path no matter what the datastatus of the materialized view is. This needs a catalog change; create a label for it.
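
The intended effect could be exercised along these lines. This is only a sketch that reuses the object names from the PR description; it is not taken from the commit, and it assumes the external file has been modified outside CBDB after the materialized view was created.

```sql
-- Sketch only: assumes aqumvtextfile.txt changed outside CBDB
-- after aqumv_ext_mv was created.

-- The view's query reads an external table, so REFRESH is expected
-- to skip the fast path and re-read the external data.
refresh materialized view aqumv_ext_mv;

-- With the GUC from this PR enabled, this query may again be
-- answered from the refreshed view.
select * from aqumv_ext_r;
```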

avamingli avatar Nov 25 '24 07:11 avamingli

Rebased to resolve conflicts.

avamingli avatar Dec 04 '24 08:12 avamingli