scenic
scenic copied to clipboard
Add topological sorting for dumped views via TSort
Rebase of https://github.com/scenic-views/scenic/pull/337
Hello! We (me and @thorncp) have been trying this code, but it hasn't worked for us. Here's exactly what we tried:
- pin scenic to
1431c49ec6d6063212aa718db73cc6158faa1461
- create view
as
- migrate
- This added
as
view at the top of the views in db/schema (alphabetical sort) - create view
bs
- migrate
- added
bs
view afteras
in the db/schema (alphabetical sort) -
replace_view
as
withselect * from bs
to generate dependent order- We also tried it with
update_view
and got the same outcome
- We also tried it with
- migrate
-
as
replaced the view and did not move it belowbs
At this point, our schema can't be loaded into the database, because it will try to create as
using bs
but bs
does not yet exist.
rails db:schema:load
ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR: relation "bs" does not exist
LINE 2: FROM bs;
^
/path/to/db/schema.rb:4416:in `block in <main>'
/path/to/db/schema.rb:13:in `<main>'
Caused by:
PG::UndefinedTable: ERROR: relation "bs" does not exist
LINE 2: FROM bs;
^
We also tried this with our existing views. Before we dumped the schema they were in a valid, topological order. After we dumped, they were in an invalid non-topological order.
It this expected? As it stands, our app would break if this got merged.
This is not expected. The idea of this PR is to provide both correct ordering (which we have today) and stable ordering (which we do not).
Your manual steps are a good test case for us to consider including in our suite today (that bs
must come before as
) to prevent regression. I'm a bit surprised we don't have something like that already though...
Thanks for the confirmation.
FWIW, I just wrote a test against this commit and it passes:
it "sorts dependency order over alphabetical order" do
conn.create_view "as", sql_definition: <<-SQL
SELECT 1;
SQL
conn.create_view "bs", sql_definition: <<-SQL
SELECT 2;
SQL
conn.update_view "as", sql_definition: <<-SQL
SELECT * FROM bs;
SQL
expect(views).to eq(%w[bs as])
end
I also tried conn.replace_view
but that doesn't take a sql_definition
and I didn't research further.
Maybe it's our setup somehow? Maybe there's some untested behavior if you migrate in between creating and updating dependent views?
Update: I tried this on a brand new app and it did put bs
before as
. So I'm thinking it has something to do with our particular app. Hopefully I'll get some more info for you.
We (@heatherbooker and I) figured it out today. Our database views are not in the public
schema, they exist in a different schema set via the config/database.yml
that looks something like this:
production:
database: app_production
schema_search_path: "abc"
The SQL that generates the list of dependent views does not include the schema name, but the code to generate view names does include the non-public
schema name.
This breaks the logic that prepares the views to be TSorted.
We updated the select portion of the SQL and now our example case passes and puts as
below bs
:
- SELECT distinct dependent_ns.nspname AS dependent_schema
- , dependent_view.relname AS dependent_view
- , source_ns.nspname AS source_schema
- , source_table.relname AS source_table
+ SELECT DISTINCT
+ dependent_ns.nspname AS dependent_schema,
+ CASE
+ WHEN dependent_ns.nspname = 'public' THEN dependent_view.relname
+ ELSE (dependent_ns.nspname || '.' || dependent_view.relname)
+ END AS dependent_view,
+ source_ns.nspname AS source_schema,
+ CASE
+ WHEN source_ns.nspname = 'public' THEN source_table.relname
+ ELSE (source_ns.nspname || '.' || source_table.relname)
+ END AS source_table
FROM pg_depend
JOIN pg_rewrite ON pg_depend.objid = pg_rewrite.oid
JOIN pg_class as dependent_view ON pg_rewrite.ev_class = dependent_view.oid
JOIN pg_class as source_table ON pg_depend.refobjid = source_table.oid
JOIN pg_namespace dependent_ns ON dependent_ns.oid = dependent_view.relnamespace
JOIN pg_namespace source_ns ON source_ns.oid = source_table.relnamespace
WHERE dependent_ns.nspname = ANY (current_schemas(false)) AND source_ns.nspname = ANY (current_schemas(false))
AND source_table.relname != dependent_view.relname
AND source_table.relkind IN ('m', 'v') AND dependent_view.relkind IN ('m', 'v')
- ORDER BY dependent_view.relname;
+ ORDER BY dependent_view;
This should replicate the current scenic logic.
I also created a test, but it's failing when the new test just before it (the test added in this PR on line 152) runs before it. Not sure why:
it "sorts dependency order when views exist in a schema" do
conn.execute("CREATE SCHEMA IF NOT EXISTS scenic; SET search_path TO scenic")
conn.execute("CREATE VIEW apples AS SELECT 1;")
conn.execute("CREATE VIEW bananas AS SELECT 2;")
conn.execute("CREATE OR REPLACE VIEW apples AS SELECT * FROM bananas;")
expect(views).to eq(%w[scenic.bananas scenic.apples])
conn.execute("DROP SCHEMA IF EXISTS scenic CASCADE;")
end
At my jobby job we've been using this since 10/26 and have made several updates to views (though none that change the dependency structure) and have not seen any unnecessary reordering of views, so I'm inclined after ~2.5 months to call this a win.
@edwardloveall I wonder if #401 helps with your use case at all? Also if you want to write up a PR to add specs that exercise view ordering I'd appreciate it.
It's good enough for me. Merge when ready.
When we had talked about this problem originally (years ago), we said we would make this a "breaking change" for 2.0 since it would likely change everyone's generated structure.sql once. Do you still feel that's appropriate?
Unclear - I think we'd re-order the schema.rb one more time whenever it's next generated (with a view change?) which has already been happening, then it should stop. That may not be a breaking change.
@edwardloveall I wonder if #401 helps with your use case at all?
It does, but only in combination with the changes here.
Also if you want to write up a PR to add specs that exercise view ordering I'd appreciate it.
Done: https://github.com/scenic-views/scenic/pull/403
I'm seeing a failure on one of the specs here after @edwardloveall's PR adding sort order specs was merged. I'm fairly sure that the spec is correct and I'm not sure why the failure is happening.
I'm seeing a failure on one of the specs here after @edwardloveall's PR adding sort order specs was merged. I'm fairly sure that the spec is correct and I'm not sure why the failure is happening.
Looking at the branch, it's similar to my previous comment: the SQL that generates the dependent views does not contain the schema, but Scenic's views do. tsorted_views
then sorts the non-schema-qualified views correctly and ignores the schema-qualified views. Then the non-schema-qualified views get filtered out.
Here's one possible fix:
def tsorted_views(views_names)
views_hash = TSortableHash.new
::Scenic.database.execute(DEPENDENT_SQL).each do |relation|
- source_v = relation["source_table"]
- dependent = relation["dependent_view"]
+ source_v = [relation["source_schema"], relation["source_table"]].join(".")
+ dependent = [relation["dependent_schema"], relation["dependent_view"]].join(".")
views_hash[dependent] ||= []
views_hash[source_v] ||= []
views_hash[dependent] << source_v
views_names.delete(source_v)
views_names.delete(dependent)
end
# after dependencies, there might be some views left
# that don't have any dependencies
views_names.sort.each { |v| views_hash[v] = [] }
binding.irb
views_hash.tsort
end
I'm seeing a failure on one of the specs here after @edwardloveall's PR adding sort order specs was merged. I'm fairly sure that the spec is correct and I'm not sure why the failure is happening.
Looking at the branch, it's similar to my previous comment: the SQL that generates the dependent views does not contain the schema, but Scenic's views do.
tsorted_views
then sorts the non-schema-qualified views correctly and ignores the schema-qualified views. Then the non-schema-qualified views get filtered out.Here's one possible fix:
def tsorted_views(views_names) views_hash = TSortableHash.new ::Scenic.database.execute(DEPENDENT_SQL).each do |relation| - source_v = relation["source_table"] - dependent = relation["dependent_view"] + source_v = [relation["source_schema"], relation["source_table"]].join(".") + dependent = [relation["dependent_schema"], relation["dependent_view"]].join(".") views_hash[dependent] ||= [] views_hash[source_v] ||= [] views_hash[dependent] << source_v views_names.delete(source_v) views_names.delete(dependent) end # after dependencies, there might be some views left # that don't have any dependencies views_names.sort.each { |v| views_hash[v] = [] } binding.irb views_hash.tsort end
Guess my comment on original PR was lost, mentioned in here https://github.com/scenic-views/scenic/pull/337#pullrequestreview-1231488975, we have been running with those changes since then without issues in apps that use cross schema views.
Superseded by https://github.com/scenic-views/scenic/pull/416