Pyrseas
Pyrseas copied to clipboard
dbtoyaml - analyze functions arguments and returns type
Would be very nice to have information about function arguments and types.
For instance
create table foo(
a int unique
);
create table bar(
b int primary key references foo(a)
);
create or replace function f(value foo, c int) returns bar as $$
SELECT ROW(value.*)::bar;
$$ language sql;
Actual output (pyrseas 0.8.0):
schema public:
...
function f(value foo, c integer):
language: sql
owner: postgres
returns: bar
source: '
SELECT ROW(value.*)::bar;
'
owner: postgres
privileges:
- PUBLIC:
- all
- postgres:
- all
...
Desired output:
schema public:
...
function f(public.foo, integer):
arguments:
- name: value
schema: public
table: foo
- name: c
type: integer
returns:
- schema: public
table: bar
language: sql
owner: postgres
returns: bar
source: '
SELECT ROW(value.*)::bar;
'
owner: postgres
privileges:
- PUBLIC:
- all
- postgres:
- all
...
or something like that
I suppose my team can prepare patch for that. I need to know, it's possible to merge it to upstream (in theory) or we should consider forking. Thank you.
Currently the info on the arguments is from pg_get_function_identity_arguments
or pg_get_function_arguments
, which we try to parse in a less than satisfactory way. The information is available in pg_proc.proargtypes
as an oidvector, so if you query the catalog for function f
above, you'll get something like 214577 23
. You can get human-understandable types by issuing a query such as SELECT 214577::regtype, 23::regtype;
which will return the tuple public.foo, integer
but you'd want to retrieve this info together with the main query for all functions, not individually for each function (particularly if you have over 1500 functions as mentioned in your previous issue).
While working on r0.8, I experimented with int2vectors (present in pg_index
and similar to oidvectors) and ARRAY functions to see if I could get index column/expression info without having to parse the output of pg_get_expr
, but couldn't find a suitable solution. Admittedly, dealing with the proargtypes
oidvector may be easier.
From the viewpoint of Pyrseas goals, in general we only want to add to the YAML output info that is needed or useful for SQL generation by yamltodb
. Usually, the YAML parallels the output from pg_dump -s
. It doesn't seem that explicitly separating datatype info for each argument would be useful, since the info is present in the current function signature (arguably, with less detail for record types). However, if a change could be made to show this info, without any significant performance degradation, and without impacting yamltodb
, I would be amenable to merging it.
There is one thing that appears to need correction in any case. Function f
depends on table foo
(technically on the type foo
) but our current YAML does not indicate this. Under certain circumstances, presumably f
could be created before foo
and that would cause a PG error. @dvarrazzo, do you agree?
Currently the info on the arguments is from pg_get_function_identity_arguments or pg_get_function_arguments, which we try to parse in a less than satisfactory way. The information is available in pg_proc.proargtypes as an oidvector, so if you query the catalog for function f above, you'll get something like 214577 23. You can get human-understandable types by issuing a query such as SELECT 214577::regtype, 23::regtype; which will return the tuple public.foo, integer but you'd want to retrieve this info together with the main query for all functions, not individually for each function (particularly if you have over 1500 functions as mentioned in your previous issue).
I think it would be way easier to fetch the map oid -> (schema, name) for all the types in the db and use it to lookup in python, instead of trying to expand it in the query. Maybe this would be poor as a general method, but pg_proc
has a lot of subtleties/inconsistencies (see proargtypes
, proallargtypes
in the docs) that makes just easier to throw a couple of ifs in Python.
There is one thing that appears to need correction in any case. Function f depends on table foo (technically on the type foo) but our current YAML does not indicate this. Under certain circumstances, presumably f could be created before foo and that would cause a PG error. @dvarrazzo, do you agree?
pg_depend
contains dependencies between functions and types so topological sort should make sure types are created before the functions. I'm afraid I can't remember how we make use of this information in yaml though...
There is one thing that appears to need correction in any case. Function f depends on table foo (technically on the type foo) but our current YAML does not indicate this. Under certain circumstances, presumably f could be created before foo and that would cause a PG error.
For me to understand the process correctly - wouldn't the same apply to type bar
? When I try to follow the yaml generation - it does output depends_on
for anything that is not inferable here. The function argument types are inferred here and the return type here so these not being listed in the yaml seams to be expected?
Thank you. Seems like we need to parse arguments definition, it would be enough But dependencies bug is crucial.
Daniele, because of the dependency cycle issue that I had recently while working on issue #176, as you know I was revisiting Post-Facto and they did have caches of "oft-referenced maps": languages, schemas (and reverse map), types, PG classes, operators, functions and pg_catalog relnames. Those would be useful, but to really take advantage of them would require quite a bit of refactoring too.
Tobias, yes, you're right: function f
should also depend on table bar
(since currently tables and row types based on those tables are treated equivalently).
@excavador If I understand correctly, you and your team will deal with parsing arguments (or have dones so already) and issue is only open to address the apparently missing dependency that was pointed by @tbussmann. In case you haven't seen it already, please take a look at https://pyrseas.wordpress.com/2018/09/12/the-future-of-pyrseas-revisited/ . It would help if someone could construct a simple test to demonstrate the problem.