spin icon indicating copy to clipboard operation
spin copied to clipboard

feat(spin_sdk::pg): Extract type conversions into sdk

Open etehtsea opened this issue 3 years ago • 4 comments

This has to be the starting point on the way to providing the API that is able to automatically map a DB response into the user-defined struct.

Signed-off-by: Konstantin Shabanov [email protected]

etehtsea avatar Oct 11 '22 16:10 etehtsea

@itowlson I've added decoding support for the following DbValues so as their NULLable variants:

  • boolean(bool)
  • int16(s16)
  • int32(s32)
  • int64(s64)
  • floating32(float32)
  • floating64(float64)
  • str(string)
  • binary(list<u8>)

Initially, I wanted to review/test types mapping inside the outbound-pg crate first because I've got a few concerns:

  • convert_data_type doesn't seem to handle all types available in the wit (at least INT2 seems to be missing) (https://github.com/fermyon/spin/blob/main/crates/outbound-pg/src/lib.rs#L133-L149);
  • tokio-postgres maps several pg types into one rust type (https://docs.rs/postgres-types/latest/postgres_types/trait.FromSql.html#types) but convert_entry maps only some types from the list (e.g. "char" is missing);

Also, I want to share some other thoughts which are out of the scope of this PR but are related to:

  • https://github.com/fermyon/spin/pull/777
  • https://github.com/fermyon/spin/issues/576
  • https://github.com/fermyon/spin-dotnet-sdk/tree/main/wit/ephemeral
  • Regarding other SQL DBs support: rather than adding support for different SQL DBs using various crates from different maintainers wouldn't it be beneficial to utilize sqlx for this matter? Originally, I was thinking that sqlx is just the layer above different DB drivers, but during my experiments last week I realized that this is not the case. I experimented with it last week and the code seems quite similar (but there are some drawbacks like https://github.com/launchbadge/sqlx/issues/1369). Besides checking the sqlx itself another goal was to check the ability to get information about the column's NOT NULL constraint from the column info. Unfortunately, there is no way to do this for dynamic SQL queries (at least in PG).
  • Regarding WIT format: at this moment I'm not sure that "generic" types (DbValue) are convenient because:
    • what to do with db-specific types?
    • there is no way to figure out the original (DB) type. For example, how to (de)serialize timestamps? They could be encoded as DbValue::Str(RFC 3339/ISO 8601) but this way it's not clear how to deserialize them. So it makes sense to add a separate enum variant for them. Another option might be to store original (DB) type in the column info.
    • current DbValue's format seems to be modeled after Rust's types but different languages might approach DB types in a different way (https://pkg.go.dev/github.com/lib/pq#hdr-Data_Types) All in all, it might be easier to introduce different mappings for different SQL DBs that are close to their types.
  • At this moment, it's not clear how to handle breaking WIT format changes (https://github.com/fermyon/spin-dotnet-sdk/tree/main/wit/ephemeral). Any change to the format will break C# SDK. When it gets more stable, it wouldn't be a big problem, but for now, it might become an issue.

etehtsea avatar Oct 12 '22 17:10 etehtsea

Thanks for the great feedback. sqlx certainly looks like it might be a great four-for-the-price-of-one solution - I didn't know about it before so thanks for the pointer!

I agree we need a better story for DB-specific types and generally on how to represent DB-specific operations and options. There's a reason every database gets its own library and we can't afford to smooth that over too much. One option longer term is to not even have database WIT specs, and provide infrastructure whereby source languages can use their own database crates / packages / modules / etc. But that's hard to do at the moment because most of them sit over raw sockets, and that sockets code doesn't yet compile to Wasm.

The current situation is a sighting shot at what we can do in the meantime, and there's certainly a good case for moving the data types out of a common WIT and into database-specific ones. But at the WIT level we'd probably still end up with a big honkin' variant, which would require languages like Go and C# to do some manual unpicking to make it idiomatic. As noted in the MySQL POC, there are other things we should look at when we approach this, such as whether to go for a more resource-oriented view of connections and transactions and so on. In any case, we're very happy to iterate on this and we'd welcome your (and other contributors') thoughts and advice.

Regarding how we evolve WIT files - this (and contract versioning more generally) is a big concern of mine, and folks within Fermyon are actively debating when to stabilise various contracts and how to version them thereafter. We have muddled along so far by batching up breaking changes to limit the number of times we have to perform breaks, but we do need a more robust and systemic solution. Unfortunately, we don't yet know what that looks like, and it likely needs to align to broader Wasm Component Model standards which are still in flux. Sorry I can't give you a more satisfying answer for now, but it's very much on our radar.

Thanks again for the detailed thoughts. This is a very early-stage area and it's so helpful to have wider conversations about it!

itowlson avatar Oct 12 '22 20:10 itowlson

such as whether to go for a more resource-oriented view of connections and transactions and so on

Yeah, I was looking into more resource-oriented approach but wit-bindgen is currently experiencing a big rewrite, so resource/handle support is dropped for now. https://github.com/bytecodealliance/wit-bindgen/commit/25596c799b3bbe85f2b5939e8538a524fa4efdf2 So it looks like the current approach is the only way for the short-term.

P.S. async support is also temporary dropped https://github.com/bytecodealliance/wit-bindgen/commit/f1682df9adacb515227d061a4601b2b4c997e7a2

etehtsea avatar Oct 13 '22 03:10 etehtsea

I've added basic integration test to test outbound-pg. It's currently excluded from the default integration test because it expects running postgresql server. The test itself is a bit awkward but I couldn't find other easy way to test it atm.

etehtsea avatar Oct 15 '22 18:10 etehtsea

UPD

Is there something else to cover?

int8(s8) support wasn't added into conversions because:

  • the only type that fits is "char" which is:

The type "char" (note the quotes) is different from char(1) in that it only uses one byte of storage, and therefore can store only a single ASCII character. It is used in the system catalogs as a simplistic enumeration type. https://www.postgresql.org/docs/current/datatype-character.html

  • "char" isn't supported in outbound-pg;

etehtsea avatar Oct 31 '22 10:10 etehtsea