sqlx icon indicating copy to clipboard operation
sqlx copied to clipboard

Proposal: adding unsigned integer support to sqlx postgres behind a feature flag

Open pendruct opened this issue 2 years ago • 5 comments

I understand that Postgres does not have support for unsigned integers, but I still think that a very strong case could be made for creating support for unsigned integers in sqlx. Here are my reasons:

  1. Some users postgres may already support unsigned Postgres integers via this popular extension, so it would be nice to allow them to have support in sqlx as well.

  2. In my case, all of my data in my app is unsigned. That is the proper type for all of my data, so my Rust types reflect that. It is very unfortunate that I have to currently do hundreds of as casts throughout my code-base and create cumbersome mapping functions just to insert unsigned values into my database, and receive unsigned values back.

  3. It is a very common use-case to create a web server that reads some rows from a database and then returns that to the user. Most of the time this is done with JSON, but it is often the case the client knows the format of the returned data, and a sort of protocol buffer can be used instead (like Flatbuffers, Protocol Buffers, Cap'n Proto, etc.) to massively improve efficiency. In these cases, signed integers often have to perform zig-zag encoding first, and then the client would have to decode the zig-zag encoding. If that vast majority of your data is actually unsigned, then having to write them all as signed integers and perform this costly zigzag encoding/decoding step is pointless.

  4. There are many people asking for unsigned integer support, so there is a clear desire.

  5. This doesn't have to be made available by default. It could be gated behind a feature flag which would have a disclaimer on the risks you face when opting in to such support.

  6. I heavily desire this support, so currently I have just downloaded sqlx, manually patched in support myself, and am using a local copy. I wouldn't be surprised if others have done this as well.

  7. It does not seem difficult to add this to sqlx. I'm not too familiar with the library, but as far as I can tell the only changes I had to make to add support for u32 and u64 were the following:

  • Navigate to sqlx\sqlx-core\src\postgres\types\int.rs and add the following:
impl Type<Postgres> for u64 {
  fn type_info() -> PgTypeInfo {
      PgTypeInfo::INT8
  }
}

impl PgHasArrayType for u64 {
  fn array_type_info() -> PgTypeInfo {
      PgTypeInfo::INT8_ARRAY
  }
}

impl Encode<'_, Postgres> for u64 {
  fn encode_by_ref(&self, buf: &mut PgArgumentBuffer) -> IsNull {
      buf.extend(&self.to_be_bytes());

      IsNull::No
  }
}

impl Decode<'_, Postgres> for u64 {
  fn decode(value: PgValueRef<'_>) -> Result<Self, BoxDynError> {
      Ok(match value.format() {
          PgValueFormat::Binary => BigEndian::read_u64(value.as_bytes()?),
          PgValueFormat::Text => value.as_str()?.parse()?,
      })
  }
}

// ---

impl Type<Postgres> for u32 {
  fn type_info() -> PgTypeInfo {
      PgTypeInfo::INT4
  }
}

impl PgHasArrayType for u32 {
  fn array_type_info() -> PgTypeInfo {
      PgTypeInfo::INT4_ARRAY
  }
}

impl Encode<'_, Postgres> for u32 {
  fn encode_by_ref(&self, buf: &mut PgArgumentBuffer) -> IsNull {
      buf.extend(&self.to_be_bytes());

      IsNull::No
  }
}

impl Decode<'_, Postgres> for u32 {
  fn decode(value: PgValueRef<'_>) -> Result<Self, BoxDynError> {
      Ok(match value.format() {
          PgValueFormat::Binary => BigEndian::read_u32(value.as_bytes()?),
          PgValueFormat::Text => value.as_str()?.parse()?,
      })
  }
}
  • Navigate to sqlx\sqlx-core\src\any\types.rs and add the following:
impl_any_type!(u32);
impl_any_type!(u64);

impl_any_encode!(u32);
impl_any_encode!(u64);

impl_any_decode!(u32);
impl_any_decode!(u64);

To summarize:

  • some users already have unsigned support in Postgres, so it would be nice to have sqlx support there
  • forcing usage of signed integers makes Rust code less accurate and forces the creation of many cumbersome mapping functions
  • forcing usage of zigzag encoding for serializing positive numbers is very wasteful (or forces the creation of many duplicate structs and mapping functions)
  • users currently have to download the crate and manually patch the changes in themselves, which is sub-optimal
  • users can opt-in to the feature so they understand the risks
  • it seems easy to add as a feature

pendruct avatar Feb 10 '23 12:02 pendruct

The reason we don't support just bit-casting unsigned integers in Rust to signed integers in SQL (this applies to SQLite as well, which has the same problem) is that any attempt to manipulate them in SQL can result in unexpected behavior.

For example, if we just send a 32768u16 in two's complement and tell the SQL server it's a SHORT, the SQL server will interpret that as -32768. Sure, just reading that back to a u16 in Rust will produce the correct value again, but the value won't be semantically correct in SQL. We try not to straight-up lie to the user like that. The query macros would also have no idea that the value is supposed to be interpreted as unsigned.

You could widen to the next-larger integer type so it can represent the full range properly, but what type would you widen u64 to? NUMERIC?

I'd accept a PR adding support for the extension behind a feature flag, but there's a few caveats:

  • It needs to be very clear in the documentation that use of unsigned integer types requires installing the extension; this is likely to be a common pain point anyway because users in our experience don't thoroughly read the documentation before trying something, and then open an issue when things don't function as expected.
  • We have to reference the extension types by name, not OID. While OIDs of types in the standard catalog are de-facto stable, OIDs of types loaded by extension depend on the exact order the extensions are loaded in. We do support referencing types by-name in the Postgres driver but there's some footguns that I can't remember right now.
  • The pguint extension doesn't define send/recv procedures for its types which means we can't use binary encoding for them, but that's what SQLx was designed around. The Postgres driver would need to know to specify text format instead of binary format for the relevant parameters/columns when creating the query portal.
  • And to throw an extra wrench into the works, the text format is locale dependent! So we'd have to check the set locale and probably error unless we know that it produces a format we can parse and output to, unless you want to write parsers and formatters for all the locales that Postgres supports. (Honestly, I'd consider the requirement of text format a hard blocker; it'd just be easier to get the extension to add binary send/recv procs... if it's even being maintained.)

abonander avatar Mar 03 '23 00:03 abonander

I came here wanting the same thing as @pendruct. But I must admit that @abonander convinced me with this part:

Sure, just reading that back to a u16 in Rust will produce the correct value again, but the value won't be semantically correct in SQL. We try not to straight-up lie to the user like that.

xpe avatar Dec 26 '23 13:12 xpe

  • The pguint extension doesn't define send/recv procedures for its types which means we can't use binary encoding for them

The more feature-full https://github.com/pg-uint/pg-uint128 does support binary send/recv so it should probably be looked at as the main option if you were to add a feature flag for a extension.

Coca162 avatar Jun 16 '25 07:06 Coca162

I'll share my experience here in hope it's useful to sqlx users. Most of our sqlx projects used to use u16, u32 and u64 on the Rust side, with the values either getting casted or converted via .try_into() before exchanging them with sqlx. On the PostgreSQL side we'd then have check constraints to make sure that the values stored in the table were positive integers. This approach worked for some time, but after a while all of these workarounds made us feel like we weren't really benefiting from Rust's robustness anymore. Given that this approach was propagated to the entire application, it even appeared in the public API when we adopted OpenAPI via utoipa.

While working on one of our latest projects, which didn't use sqlx but that's a different story, we decided to finally properly fix this and released benzina. One of the features of benzina are the U15, U31 and U63 types. Those types are u16, u32 and u64 internally, but the constructors and all APIs enforce the value to be in range of 0..=$inner_type_but_signed::MAX. This is then also propagated to the serde Deserialize implementation, making out of range values fail deserialization, just as it would fail if someone passed val > u32::MAX to u32. It doesn't stop there: the OpenAPI spec reflects the correct allowed range, and also the type does not implement Add, AddSigned or any of the other silently wrapping math traits, just like NonZero, because wrapping is not what you want most of the time.

I haven't kept up with all of the recent sqlx development lately, but while it can be extremely convenient at first to implement conversions for unsigned Rust types, in the long run I feel like it will turn out to have been a big mistake, and a proper solution that is worth of Rust's values should instead be the goal.

paolobarbolini avatar Jun 16 '25 07:06 paolobarbolini

in the meantime, this crate seems to do the job: https://github.com/bitfl0wer/sqlx-pg-uint

FrenchGithubUser avatar Oct 17 '25 14:10 FrenchGithubUser