sqlx icon indicating copy to clipboard operation
sqlx copied to clipboard

Left join and group by on nullable field returns runtime `UnexpectedNullError` error - PostgreSQL, query macros

Open BenJeau opened this issue 10 months ago • 5 comments

Bug Description

When doing a left join between two tables on a field that is nullable and performing a group by on the joined table field, a UnexpectedNullError is returned

Minimal Reproduction

The following is a simplified reproduction from my current project:

CREATE TABLE "providers" (
    "id" BIGINT PRIMARY KEY,
    "name" TEXT NOT NULL
);

CREATE TABLE "sources" (
    "id" BIGINT PRIMARY KEY,
    "provider_id" INTEGER REFERENCES "providers" ("id")
);

INSERT INTO "providers" ("name") VALUES ('github');
INSERT INTO "sources" ("provider_id") VALUES (1), (1), (NULL);

The following query within SQLx errors but should return valid data:

sqlx::query!(
        r#"SELECT count(*), providers.id, providers.name
FROM sources
LEFT JOIN providers ON providers.id = sources.provider_id
GROUP BY providers.id;"#
    )
    .fetch_all(pool)
    .await;

Value of fetch_all return:

Err(
    ColumnDecode {
        index: "1",
        source: UnexpectedNullError,
    },
)

Query data using an external DB tool:

[
  {
    "count": 1,
    "id": null,
    "name": null
  },
  {
    "count": 2,
    "id": 1,
    "name": "google"
  }
]

I alter the query to reference the original table only, everything works:

sqlx::query!(
        r#"SELECT count(*), sources.provider_id
FROM sources
LEFT JOIN providers ON providers.id = sources.provider_id
GROUP BY sources.provider_id;"#
    )
    .fetch_all(pool)
    .await;

Value of fetch_all return:

Ok(
    [
        Record {
            count: Some(
                1,
            ),
            provider_id: None,
        },
        Record {
            count: Some(
                2,
            ),
            provider_id: Some(
                1,
            ),
        },
    ]
)

But since I want the other field (name) from the other table, I cannot/do not want to do the last query.

Info

  • SQLx version: 0.7.4
  • SQLx features enabled: ["chrono", "json", "macros", "postgres", "runtime-tokio-native-tls", "uuid"]
  • Database server and version: PostgreSQL 16.2 (running within docker with the postgres:16.2-alpine3.19 image)
  • Operating system: MacOS 14.4.1
  • rustc --version: rustc 1.77.0 (aedd173a2 2024-03-17) (running within docker with the rust:1.77.0-slim-bookworm image)

BenJeau avatar Mar 31 '24 19:03 BenJeau

I have tested with 0.7.4, 0.7.0 and 0.6.3 and they all have this issue.

BenJeau avatar Mar 31 '24 19:03 BenJeau

Please provide the output of EXPLAIN(VERBOSE, FORMAT JSON) <query>

abonander avatar Mar 31 '24 19:03 abonander

I've also tried the query and query_as version that is not a macro and it is returning the data as expected with no errors.


Here's the query explained as requested:

[
  {
    "Plan": {
      "Node Type": "Aggregate",
      "Strategy": "Hashed",
      "Partial Mode": "Simple",
      "Parallel Aware": false,
      "Async Capable": false,
      "Startup Cost": 88.43,
      "Total Cost": 101.13,
      "Plan Rows": 1270,
      "Plan Width": 44,
      "Output": ["count(*)", "providers.id", "providers.name"],
      "Group Key": ["providers.id"],
      "Planned Partitions": 0,
      "Plans": [
        {
          "Node Type": "Hash Join",
          "Parent Relationship": "Outer",
          "Parallel Aware": false,
          "Async Capable": false,
          "Join Type": "Left",
          "Startup Cost": 38.58,
          "Total Cost": 77.13,
          "Plan Rows": 2260,
          "Plan Width": 36,
          "Output": ["providers.id", "providers.name"],
          "Inner Unique": true,
          "Hash Cond": "(sources.provider_id = providers.id)",
          "Plans": [
            {
              "Node Type": "Seq Scan",
              "Parent Relationship": "Outer",
              "Parallel Aware": false,
              "Async Capable": false,
              "Relation Name": "sources",
              "Schema": "public",
              "Alias": "sources",
              "Startup Cost": 0.00,
              "Total Cost": 32.60,
              "Plan Rows": 2260,
              "Plan Width": 4,
              "Output": ["sources.id", "sources.provider_id"]
            },
            {
              "Node Type": "Hash",
              "Parent Relationship": "Inner",
              "Parallel Aware": false,
              "Async Capable": false,
              "Startup Cost": 22.70,
              "Total Cost": 22.70,
              "Plan Rows": 1270,
              "Plan Width": 36,
              "Output": ["providers.id", "providers.name"],
              "Plans": [
                {
                  "Node Type": "Seq Scan",
                  "Parent Relationship": "Outer",
                  "Parallel Aware": false,
                  "Async Capable": false,
                  "Relation Name": "providers",
                  "Schema": "public",
                  "Alias": "providers",
                  "Startup Cost": 0.00,
                  "Total Cost": 22.70,
                  "Plan Rows": 1270,
                  "Plan Width": 36,
                  "Output": ["providers.id", "providers.name"]
                }
              ]
            }
          ]
        }
      ]
    }
  }
]

BenJeau avatar Mar 31 '24 20:03 BenJeau

I've run into this issue a couple times

spencerbart avatar Apr 01 '24 22:04 spencerbart

A column should be nullable during left joins that but the generated query-<hash>.json marks the column as nullable false and then I get the error ColumnDecode { index: "7", source: UnexpectedNullError }

spencerbart avatar Apr 05 '24 18:04 spencerbart