Server/CLI: Incorrect `replace_metadata` API behaviour
Version Information
Server Version: v2.10.1
CLI Version (for CLI-related issue): v2.10.1
Environment
OSS/Cloud/EE
What is the current behaviour?
Let's take a look at the following scenario.
Note: For simplicity, in the examples below I'm using the Hasura CLI to apply metadata, which uses the replace_metadata API under the hood (a sketch of the equivalent raw API call is shown right after the metadata document below).
I have a new HGE instance running. I'm applying the following metadata to the server.
{
  "version": 3,
  "sources": [
    {
      "name": "default",
      "kind": "postgres",
      "tables": [
        {
          "table": {
            "name": "t1",
            "schema": "public"
          }
        },
        {
          "table": {
            "name": "t2",
            "schema": "public"
          }
        }
      ],
      "configuration": {
        "connection_info": {
          "database_url": {
            "from_env": "HASURA_GRAPHQL_DATABASE_URL"
          },
          "isolation_level": "read-committed",
          "pool_settings": {
            "connection_lifetime": 600,
            "idle_timeout": 180,
            "max_connections": 50,
            "retries": 1
          },
          "use_prepared_statements": true
        }
      }
    }
  ]
}
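For reference, hasura metadata apply is roughly equivalent to posting this document to the metadata API. This is a minimal sketch, assuming the server is on localhost:8080, an admin secret is configured, and the document above is saved as metadata.json (all three are assumptions, not part of the report):

# Sketch only: send the metadata document above as a replace_metadata call.
# The endpoint, admin-secret variable, and metadata.json path are assumptions.
curl -sS http://localhost:8080/v1/metadata \
  -H 'Content-Type: application/json' \
  -H "x-hasura-admin-secret: ${HASURA_GRAPHQL_ADMIN_SECRET}" \
  -d "$(jq -n --slurpfile md metadata.json '{type: "replace_metadata", args: $md[0]}')"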
As expected, our server now has inconsistencies (shown below) because the tables t1 and t2 do not exist yet. So, I go ahead and apply migrations which create these tables.
$ hasura md ic list
NAME TYPE DESCRIPTION REASON
t2 table {"name":"t2","schema":"public"}... Inconsistent object: no such
table/view exists in source:
"t2"
t1 table {"name":"t1","schema":"public"}... Inconsistent object: no such
table/view exists in source:
"t1"
$ hasura migrate apply --all-databases
INFO migrations applied on database: default
Now I try to apply metadata again. Pause for a second here and think about what might be the expected result of this operation.
$ hasura metadata apply
WARN Metadata is inconsistent
WARN Use 'hasura metadata ic list' command to list inconsistent objects
INFO Metadata applied
$ hasura md ic list
NAME TYPE DESCRIPTION REASON
t2 table {"name":"t2","schema":"public"}... Inconsistent object: no such
table/view exists in source:
"t2"
t1 table {"name":"t1","schema":"public"}... Inconsistent object: no such
table/view exists in source:
"t1"
But as we see above, the metadata is inconsistent.
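For what it's worth, an explicit reload after the migration should clear these inconsistencies, since reload_metadata rebuilds the schema cache from the stored metadata. A minimal sketch of that workaround (the endpoint and admin-secret variable are assumptions):

# Workaround sketch: force a schema cache rebuild after the migration.
hasura metadata reload
# or, against the metadata API directly:
curl -sS http://localhost:8080/v1/metadata \
  -H "x-hasura-admin-secret: ${HASURA_GRAPHQL_ADMIN_SECRET}" \
  -d '{"type": "reload_metadata", "args": {}}'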
What is the expected behaviour?
With respect to the example above: after resolving the inconsistencies, i.e. creating the relevant tables, metadata apply should not result in inconsistencies.
How to reproduce the issue?
Refer to the description above.
Keywords
replace_metadata, hasura metadata apply
@manasag, pretty sure this is for the product platform team, either the CLI or the metadata on the server.
This bug is accepted and has been kept in the Platform team backlog.
I've been digging into this and it seems quite clear that this is caused by a failure to invalidate the relevant parts of the schema cache upon replacing the metadata.
This seems intentional. I'm happy to change the behavior so that the schema cache is invalidated upon replacing the metadata, but before I do that, it'd be good to know that we are happy with this change, and that there isn't a good reason why we should keep it the way it is.
@manasag, what do you think?
@abooij, do you know why the replace_metadata endpoint doesn't force a reload of existing source data?
The inconsistencies raised by migrations normally get resolved after the final hasura metadata apply. So what causes this to normally work as expected, but fail in the case reported in this issue?
Depending on the answer, I'm inclined to say that this is an oversight. I was not involved in the original development of replace_metadata - @SamirTalwar, have you done any archeological work in this regard to dig up the design decisions involved?
At some point there was an idea to have the Console be fully powered by the replace_metadata API, rather than using the incremental metadata API. In that case, it would be important not to invalidate the schema cache fully every time.
@0x777 might also have some context here.
I tried digging into the thought process behind replace_metadata but didn't find too much; I think tacit knowledge is going to get us further than Git history in this case.
The inconsistencies raised by migrations normally get resolved after the final hasura metadata apply. So what causes this to normally work as expected, but fail in the case reported in this issue?
In this case, the inconsistencies are already present, which I think is different from the case you're describing, in which the migration introduces an inconsistency. Do I have this right?
Are you saying the migration step should automatically reload the schema, or is it more nuanced than that?
At some point there was an idea to have the Console be fully powered by the replace_metadata API, rather than using the incremental metadata API. In that case, it would be important not to invalidate the schema cache fully every time.
That's a good point. However, between the two options, I think ignoring the previous state and 'applying the metadata from scratch' seems less surprising than our current behaviour. @rikinsk What are your thoughts?
The inconsistencies raised by migrations normally get resolved after the final hasura metadata apply
Interesting. I have not considered this at all. So why isn't the schema cache being rebuilt after the last pg_run_sql, which would've resolved the inconsistencies?
-- Does executing this query require a schema cache rebuild?
queryModifiesSchema :: RQLQuery -> Bool
queryModifiesSchema = \case
  RQInsert _ -> False
  RQSelect _ -> False
  RQUpdate _ -> False
  RQDelete _ -> False
  RQCount _ -> False
  RQRunSql q -> Postgres.isSchemaCacheBuildRequiredRunSQL q
  RQCitusRunSql q -> Postgres.isSchemaCacheBuildRequiredRunSQL q
  RQCockroachRunSql q -> Postgres.isSchemaCacheBuildRequiredRunSQL q
  RQMssqlRunSql q -> MSSQL.isSchemaCacheBuildRequiredRunSQL q
  RQMysqlRunSql _ -> False
  RQBigqueryRunSql _ -> False
  RQBigqueryDatabaseInspection _ -> False
  RQBulk l -> any queryModifiesSchema l

-- For Postgres run_sql: an explicit check_metadata_consistency flag takes
-- precedence; otherwise fall back to scanning the SQL for DDL-like keywords.
Postgres.isSchemaCacheBuildRequiredRunSQL :: RunSQL -> Bool
Postgres.isSchemaCacheBuildRequiredRunSQL RunSQL {..} =
  case rTxAccessMode of
    Q.ReadOnly -> False
    Q.ReadWrite -> fromMaybe (containsDDLKeyword rSql) rCheckMetadataConsistency
  where
    containsDDLKeyword =
      TDFA.match
        $$( quoteRegex
              TDFA.defaultCompOpt
                { TDFA.caseSensitive = False,
                  TDFA.multiline = True,
                  TDFA.lastStarGreedy = True
                }
              TDFA.defaultExecOpt
                { TDFA.captureGroups = False
                }
              "\\balter\\b|\\bdrop\\b|\\breplace\\b|\\bcreate function\\b|\\bcomment on\\b"
          )
We seem to return false when the SQL statement is something along the lines of create table (..). If I remember correctly, this design was meant to prevent a consistent object from becoming inconsistent. Hence, we only look for the keywords alter, replace, etc., i.e. statements which could potentially change the behaviour of a metadata object (create function is there, I think, because you can create overloaded functions, which causes function objects in metadata to become inconsistent). However, we never really accounted for the transition of a metadata object from inconsistent to consistent. Maybe we should take a look at this again.
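For context, this check corresponds to the check_metadata_consistency flag on run_sql: when a client sets it explicitly, the keyword regex is bypassed (that is the rCheckMetadataConsistency field in the snippet above). A minimal sketch of forcing a cache rebuild for a create table statement that way; the endpoint, admin secret, source name, and table definition are all assumptions:

# Sketch: run_sql with check_metadata_consistency set explicitly, which
# overrides the DDL-keyword regex shown above.
curl -sS http://localhost:8080/v2/query \
  -H "x-hasura-admin-secret: ${HASURA_GRAPHQL_ADMIN_SECRET}" \
  -d '{
        "type": "run_sql",
        "args": {
          "source": "default",
          "sql": "CREATE TABLE public.t1 (id serial PRIMARY KEY);",
          "check_metadata_consistency": true
        }
      }'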
If I remember correctly, this design was meant to prevent a consistent object becoming inconsistent.
In what kind of scenario would a create table cause this to happen?
I think there are two issues here.
The first is the one pointed out by @scriptonist. I think that it can be summarized as follows: replace_metadata should have the effect of also reloading the metadata. Exporting the metadata, and then immediately replacing the metadata with the exported metadata (i.e. a no-op) should have the effect of reloading the metadata.
replace_metadata . export_metadata === reload_metadata
This is a departure from the current behavior, but IMO a good one, as I think it will be less surprising for the average user.
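As a concrete illustration of that equivalence, a no-op export-then-replace over the API would look roughly like this (a sketch; the endpoint and admin-secret variable are assumptions):

# Sketch: export the current metadata and immediately replace with it.
# Under the proposed semantics this should behave like reload_metadata.
curl -sS http://localhost:8080/v1/metadata \
  -H "x-hasura-admin-secret: ${HASURA_GRAPHQL_ADMIN_SECRET}" \
  -d '{"type": "export_metadata", "args": {}}' > exported.json

curl -sS http://localhost:8080/v1/metadata \
  -H "x-hasura-admin-secret: ${HASURA_GRAPHQL_ADMIN_SECRET}" \
  -d "$(jq -n --slurpfile md exported.json '{type: "replace_metadata", args: $md[0]}')"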
The second issue is the one highlighted by @0x777: a CREATE TABLE query, run through the run_sql command, can have the effect of making currently-inconsistent metadata consistent. This means that we probably need to invalidate the cache for CREATE alongside all the other keywords we recognize as "DDL". If we had this behavior, @scriptonist would not need to reload (or reapply) the metadata at all.
I think this is important, but there's nothing preventing @scriptonist from modifying the table using psql or another migration framework instead of hasura migrate. I therefore don't consider it a fix to this issue, though I do think it's worth fixing.
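For example, DDL applied outside of Hasura never passes through queryModifiesSchema at all, so no rebuild is triggered. A sketch of such an out-of-band change (the connection string and table definition are assumptions):

# Sketch: creating the table directly via psql bypasses run_sql entirely,
# so the server has no chance to rebuild the schema cache.
psql "$PG_DATABASE_URL" -c 'CREATE TABLE public.t1 (id serial PRIMARY KEY);'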
(@scriptonist, please correct me if I've misinterpreted you in any way.)
@0x777, @manasag, what do you think? Do we fix both of these? Only one of them?
replace_metadata should have the effect of also reloading the metadata
This means that we probably need to invalidate the cache for CREATE alongside all the other keywords we recognize as "DDL". If we had this behavior, @scriptonist would not need to reload (or reapply) the metadata at all.
I think this is important, but there's nothing preventing @scriptonist from modifying the table using psql or another migration framework instead of hasura migrate. I therefore don't consider it a fix to this issue, though I do think it's worth fixing
@SamirTalwar I agree with all of these. Let's change the replace_metadata behaviour in one PR, and then create an issue for 'create table' turning inconsistent objects into consistent ones and add it to our backlog.
If I remember correctly, this design was meant to prevent a consistent object becoming inconsistent.
In what kind of scenario would a create table cause this to happen?
It wouldn't, and hence we don't rebuild the schema cache when we see create table-like statements. Note that there is no regular expression match for create or create table in isSchemaCacheBuildRequiredRunSQL.
@scriptonist, I have resolved this in https://github.com/hasura/graphql-engine/commit/93e8803d3e1cc4650229f3764743697abefc2665. Happy to close this out now?