gpdb icon indicating copy to clipboard operation
gpdb copied to clipboard

REGRESSION: Queries that works in releases prior to 7, fails in 7

Open lmugnano4537 opened this issue 1 year ago • 6 comments
trafficstars

Bug Report


  1. This is a simplified part of a larger query being used to find append-optimized tables that need to be vacuumed. In releases prior to Greenplum 7 the following query works:
select gp_toolkit.__gp_aovisimap_hidden_info(a.relid) from pg_appendonly a
;

In Greenplum 7, the above query fails with error:

ERROR:  function with EXECUTE ON restrictions cannot be used in the SELECT list of a query with FROM

  1. This is a query that shows compaction information for a AO table:
select (gp_toolkit.__gp_aovisimap_compaction_info(c.oid)).* from pg_class c join pg_namespace n on n.oid = c.relnamespace where nspname = lower('dbamonitor') and relname = lower('masterlog')
;

This query works fine in releases < 7 but fails in 7.1 with error:

ERROR:  function not supported on relation  (seg1 slice1 172.17.0.2:6001 pid=93894)
CONTEXT:  SQL function "__gp_aovisimap_hidden_typed" statement 1
PL/pgSQL function gp_toolkit.__gp_aovisimap_compaction_info(oid) line 7 at FOR over SELECT rows

I don't see anything in the release notes that indicates what breaks these queries.

Please document a workaround for queries that are going to fail because of whatever change is causing these to fail or fix the regression so these queries work like they did prior to 7.

Greenplum version or build

7.1

OS version and uname -a

Redhat 8.7

autoconf options used ( config.status --config )

Installation information ( pg_config )

Expected behavior

Query should not return an error and work like it has in prior releases

Actual behavior

Fails with unclear error with no clear indication on how to actually resolve the error.

Step to reproduce the behavior

Run the above query in Greenplum 7, run the same query in Greenplum 6. Compare results

lmugnano4537 avatar Apr 10 '24 01:04 lmugnano4537

For 1:

In GPDB6 doc https://docs.vmware.com/en/VMware-Greenplum/6/greenplum-database/ref_guide-sql_commands-CREATE_FUNCTION.html?hWord=N4IghgNiBcIMYCcCmYAuSAEAzArgOzlQEsB7PEAXyA

the section Using EXECUTE ON attributes has detailed description.

GP6 don't match the document due to some bugs in the code (forget to check something). GP7 fixes them.


For 2:

What is the pg_class entry of the rel dbamonitor.masterlog?

kainwen avatar Apr 14 '24 15:04 kainwen

For #1, that description is in the create function statement but the function I’m using isn’t one I wrote, it’s a function delivered by the product in the gp_toolkit schema so more pertinent to the developer who wrote the function vs the user of said function. Bottom line, either way, works in every release prior to 7 and fails in gpdb7 with no documented alternatives (again, the function is delivered by the product).

For #2, dbamonitor.masterlog is just one example of a appendoptimized table, it fails with any AO table

On Sun, Apr 14, 2024 at 11:39 AM Zhenghua Lyu @.***> wrote:

For 1:

In GPDB6 doc https://docs.vmware.com/en/VMware-Greenplum/6/greenplum-database/ref_guide-sql_commands-CREATE_FUNCTION.html?hWord=N4IghgNiBcIMYCcCmYAuSAEAzArgOzlQEsB7PEAXyA

the section Using EXECUTE ON attributes has detailed description.

GP6 don't match the document due to some bugs in the code (forget to check something). GP7 fixes them.

For 2:

What is the pg_class entry of the rel dbamonitor.masterlog?

— Reply to this email directly, view it on GitHub https://github.com/greenplum-db/gpdb/issues/17331#issuecomment-2054101044, or unsubscribe https://github.com/notifications/unsubscribe-auth/BDATK4PCNLW5BN537FD6ALLY5KPLRAVCNFSM6AAAAABF7SEAE2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANJUGEYDCMBUGQ . You are receiving this because you authored the thread.Message ID: @.***>

lmugnano4537 avatar Apr 14 '24 16:04 lmugnano4537

Actually for #2, my statement might not be correct, dbamonitor.masterlog is not just any AO table, it’s the parent table of a partitioned table which might be part of the problem. I’ll try it with a non-partitioned AO table and see if the same query fails. I’m offline now but will try it and let you know. Original query though still worked in releases prior to 7

On Sun, Apr 14, 2024 at 12:01 PM Louis Mugnano @.***> wrote:

For #1, that description is in the create function statement but the function I’m using isn’t one I wrote, it’s a function delivered by the product in the gp_toolkit schema so more pertinent to the developer who wrote the function vs the user of said function. Bottom line, either way, works in every release prior to 7 and fails in gpdb7 with no documented alternatives (again, the function is delivered by the product).

For #2, dbamonitor.masterlog is just one example of a appendoptimized table, it fails with any AO table

On Sun, Apr 14, 2024 at 11:39 AM Zhenghua Lyu @.***> wrote:

For 1:

In GPDB6 doc https://docs.vmware.com/en/VMware-Greenplum/6/greenplum-database/ref_guide-sql_commands-CREATE_FUNCTION.html?hWord=N4IghgNiBcIMYCcCmYAuSAEAzArgOzlQEsB7PEAXyA

the section Using EXECUTE ON attributes has detailed description.

GP6 don't match the document due to some bugs in the code (forget to check something). GP7 fixes them.

For 2:

What is the pg_class entry of the rel dbamonitor.masterlog?

— Reply to this email directly, view it on GitHub https://github.com/greenplum-db/gpdb/issues/17331#issuecomment-2054101044, or unsubscribe https://github.com/notifications/unsubscribe-auth/BDATK4PCNLW5BN537FD6ALLY5KPLRAVCNFSM6AAAAABF7SEAE2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANJUGEYDCMBUGQ . You are receiving this because you authored the thread.Message ID: @.***>

lmugnano4537 avatar Apr 14 '24 16:04 lmugnano4537

Ok, for query #2, the issue does only occur for parent tables of a partitioned table that is AO. That goes back to another issue I opened where it looks like AO parent partitions no longer get represented as AO (which I guess technically they aren't so even though it's a regression / difference from gpdb6 it may be more accurate).

It works for a plain AO table:

gpadmin=# select (gp_toolkit.__gp_aovisimap_compaction_info(c.oid)).* from pg_class c join pg_namespace n on n.oid = c.relnamespace where nspname = lower('dbamonitor') and relname = lower('orphan_files') ; NOTICE: gp_appendonly_compaction_threshold = 10 content | datafile | compaction_possible | hidden_tupcount | total_tupcount | percent_hidden ---------+----------+---------------------+-----------------+----------------+---------------- (0 rows)

gpadmin=# \d+ dbamonitor.orphan_files Table "dbamonitor.orphan_files" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ----------+-----------------------+-----------+----------+---------+----------+--------------+------------- database | character varying(25) | | | | extended | | content | integer | | | | plain | | ts | text | | | | extended | | filename | text | | | | extended | | filepath | text | | | | extended | | Distributed randomly Access method: ao_row Options: compresstype=zlib, compresslevel=5, blocksize=32768, checksum=true

lmugnano4537 avatar Apr 15 '24 00:04 lmugnano4537

To simplify the requirement for query #1, let me just state the use case and maybe there is a new/better way to do it in Greenplum 7 that doesn't require me using the function that no longer works. The requirement is very simple:

I need to efficiently find the list of AO tables in the database that need to be vacuumed. The way I was doing this before Greenplum 7 was using this function to get the "hidden" rows and comparing that against the total rows to determine, based on percentage hidden, that I should vacuum.

That's the requirement, is there a way to do this in Greenplum 7 that is efficient (ie, not looping through individual table by table), keep in mind that many large customers have 100,000's AO table/partitions

lmugnano4537 avatar Apr 15 '24 15:04 lmugnano4537

Regarding #1, I found a workaround to accomplish what I need, if I use a different function from gp_toolkit that seems to return the exact same information I need works fine (which is even more confusing to users) so there is some difference in how these 2 similar functions were written by Greenplum R&D which makes one work and one not in GPDB7:

This query works fine: select (gp_toolkit.__gp_aovisimap_compaction_info(c.relid)).total_tupcount from pg_appendonly c;

Since I only need the total_tupcount column I can use this function instead of the other function which fails

lmugnano4537 avatar Apr 16 '24 14:04 lmugnano4537