sqlx icon indicating copy to clipboard operation
sqlx copied to clipboard

Refactor Postgres types, guarantee resolution before usage

Open demurgos opened this issue 2 years ago • 3 comments

I am sharing my progress while refactoring the Postgres types (see https://github.com/launchbadge/sqlx/issues/1797).

The main idea is to split PgType as it's too general. The new model has the following hierarchy:

  • PgBuiltinType: Any type from the default catalog, natively known by SQLx
  • PgType: A fully resolved type, corresponds to builtin types + resolved custom types
  • PgLazyType: A type that may not be resolved yet, corresponds to fully resolved types + DeclaredBy{Name|Oid}

I also started implementing the TypeRegistry struct to replace the ad-hoc per-connection cache. Since it contains all resolved types, it should provide PgType known to be fully resolved (including subtypes).

demurgos avatar Apr 15 '22 17:04 demurgos

I was able to advance on this PR and the hardest part should be done.

The PgTypeRegistry struct keeps track of the current state of all the known types. In particular, it knows if a type is fully resolved (i.e. including all its transitive dependencies). This is achieved by implementing a depth-first search through the dependencies. This search is represented with the struct PgTypeResolution and can be paused and resumed (it follows the nightly Generator trait).

This aspect is important: it allows to insert types in any order into the registry. When a type is added, we create a PgTypeResolution to check its dependencies. It runs until the check completes, or until we find a not-yet-resolved dependency. In this situation, the search is paused and stored in the registry. At a later point, when we get data about this missing dependency, we can resume the search from where we stopped.

This also means that at every point in time, we know which dependencies need to be resolved: this is what drives queries to the database. I still need to implement this part, but it should be relatively simple. I also need to add more tests and improve diagnostics.

Note: a previous version performed the checks when reading the type information, but the current design with generators allows to compute the check once at insertion. I expect way more reads than type insertions so this is the better trade-off.

demurgos avatar Apr 21 '22 21:04 demurgos

I started the work in my PR by focusing on types. However, by working on it, it seems that the core of issue is synchronization of the catalog between the local program and remote database. Caching and analyzing type information is only part of it.

Following this, I am renaming the main struct from TypeRegistry to LocalPgCatalog: this is more general. The idea is that it may be extended in the future, for example to cache namespaces (it may required to improve type resolution by name).

demurgos avatar Apr 24 '22 18:04 demurgos

Some updates: I added checks to detect inconsistent DB responses. For example when a type kind changed or when the status present/missing changed. In practice we cache everything forever and only fetch each type once so these errors should not happen. But it still provide nice sanity checks for non-hot paths.

I also started work on integrating my changes into the rest of SQLx by replacing the fields cache_type_info and cache_type_oid from PgConnection. The biggest challenge was that my API returned views for resolved types, using a shared lifetime. However, most of SQLx assumes that types are static. Changing this assumption is out-of-scope here (but should probably be explored), so I added a new API based on Arc and Weak. The types are rooted in the catalog, and internal references use Weak to avoid memory leaks.

My current challenge regarding the integration with SQLx is somewhat related to the previous one. There is a single TypeInfo associated type, without any distinction between fetched and declared types. It makes it hard to staticly track if the type is resolved or not. I think I'll add a new associated type to represent "resolved" types (fetched with all transitive dependencies).

demurgos avatar May 01 '22 18:05 demurgos

@demurgos it's been a while, do you have any interest in finishing this? We can earmark it for 0.7.0 so you can make breaking changes if necessary.

abonander avatar Sep 03 '22 00:09 abonander