orchid-orm icon indicating copy to clipboard operation
orchid-orm copied to clipboard

Chain on many-to-many relation produces invalid SQL, sometimes

Open mordechaim opened this issue 10 months ago • 11 comments

For this error I'm experiencing I haven't been able to reproduce, but it seems like the generated query is different when the stars line up perfectly. Users kept complaining on production about random crashes so I added logging; I finally was able to find the offending query, but it doesn't produce any wrong query when I explicitly test for it.

Since I'm not able to reproduce it, I'll paste the query as I'm experiencing it without redacting any relavant information.

Tables

facility.table.ts

export class FacilityTable {
    readonly table = 'facility'

    // ... columns


    readonly softDelete = 'deleted'

    relations = {
        users: this.hasAndBelongsToMany(() => UserTable, {
            columns: ['id'],
            references: ['facilityId'],
            through: {
                table: 'user_facility_rel',
                columns: ['userId'],
                references: ['id']
            }
        })
    }
}

user.table.ts

export class UserTable extends BaseTable {
    readonly table = 'user'

    // ... columns

    readonly softDelete = 'deleted'

    relations = {
        facilities: this.hasAndBelongsToMany(() => FacilityTable, {
            columns: ['id'],
            references: ['userId'],
            through: {
                table: 'user_facility_rel',
                columns: ['facilityId'],
                references: ['id']
            }
        }),

        integration: this.hasOne(() => IntegrationTable, {
            columns: ['id'],
            references: ['userId']
        })
    }
}

integration.ts

export class IntegrationTable extends BaseTable {
    readonly table = 'integration'

    columns = this.setColumns(t => ({
        id: t.identity().primaryKey(),
        modules: t.array(t.text()),
        config: t.json(),
        userId: t.integer()
    }))


    relations = {
        user: this.belongsTo(() => UserTable, {
            columns: ['userId'],
            references: ['id'],
            required: true
        })
    }
}

Query

Removed unrelated selects:

const result = await db.facility.select({
    integrations: q => q.users.chain('integration').select('userId', 'key', 'modules'),
})

Logged query

In most cases this proces the expected valid SQL:

SELECT COALESCE("integrations".r, '[]') "integrations"
FROM "facility"
LEFT JOIN LATERAL (
    SELECT json_agg(row_to_json("t".*)) r 
    FROM (
        SELECT 
            "integration"."user_id" "userId", 
            "integration"."key", 
            "integration"."modules" 
        FROM "integration" 
        WHERE EXISTS (
            SELECT 1 
            FROM "user" AS "users" 
            WHERE (
                EXISTS (
                    SELECT 1 
                    FROM "user_facility_rel" 
                    WHERE "user_facility_rel"."user_id" = "users"."id" 
                    AND "user_facility_rel"."facility_id" = "facility"."id"
                ) 
                AND "users"."id" = "integration"."user_id"
            ) 
            AND ("users"."deleted" IS NULL)
        )
    ) "t"
) "integrations" ON true

But when erroring, the logged SQL is:

SELECT COALESCE("integrations".r, '[]') "integrations"
FROM "facility"
LEFT JOIN LATERAL (
    SELECT json_agg(row_to_json("t".*)) r 
    FROM (
        SELECT 
            "integration"."user_id" "userId", 
            "integration"."key", 
            "integration"."modules" 
        FROM "integration" 
        WHERE "integration"."user_id" = "user"."id" -- <- this line
        AND EXISTS (
            SELECT 1 
            FROM "user" AS "users" 
            WHERE (
                EXISTS (
                    SELECT 1 
                    FROM "user_facility_rel" 
                    WHERE "user_facility_rel"."user_id" = "users"."id" 
                    AND "user_facility_rel"."facility_id" = "facility"."id"
                ) 
                AND "users"."id" = "integration"."user_id"
            ) 
            AND ("users"."deleted" IS NULL)
        )
    ) "t"
) "integrations" ON true

Notice the WHERE "integration"."user_id" = "user"."id" present there. This causes the error:

missing FROM-clause entry for table "user"

mordechaim avatar Mar 17 '25 17:03 mordechaim

Possibly related (also random relations problem): #469

IlyaSemenov avatar Mar 20 '25 01:03 IlyaSemenov

I'll try to fix it asap, but I can't repro it either.

It's caused by mutability, it must be broken after executing some other query, and at some point the mutated WHERE "integration"."user_id" = "user"."id" is discarded.

romeerez avatar Mar 22 '25 22:03 romeerez

It's caused by mutability

I'm not entirely sure what you mean, but there is indeed a JavaScript level condition which adds another where to the query (using q variable reassignment, as the correct way of updating queries). That condition is completely unrelated to the integration join though.

mordechaim avatar Mar 23 '25 01:03 mordechaim

but there is indeed a JavaScript level condition which adds another where to the query (using q variable reassignment, as the correct way of updating queries).

Could you tell me more, do you mean inside the ORM code, or in your own code?

That condition is completely unrelated

Who knows, maybe it adds it in one place and it's remembered in the other place.

romeerez avatar Mar 23 '25 16:03 romeerez

What I mean was that while i simplified my code in this issue, it really looks something along the lines of this:

let q = db.facility.select({
    integrations: q => q.users.chain('integration').select('userId', 'key', 'modules'),
})

if(filters?.ids) {
    q = q.where({ id: { in: filter.ids } })
}

return await q

Who knows, maybe it adds it in one place and it's remembered in the other place.

Are you saying this might happen when another part of my code creates a query, it tampers with the global db object? That would be completely unexpected...

mordechaim avatar Mar 23 '25 17:03 mordechaim

if(filters?.ids) {
    q = q.where({ id: { in: filter.ids } })
}

This is harmless.

So, I looked through the codebase, can't spot the bug.

I ran the same relation query tests in the same JS context multiple times to ensure that no where condition incidentally sticks to where it shouldn't - okay.

Going to review relational chaining code, but not sure how or where to find the cause.

The key challenge is to find a way to reproduce it.

Maybe you could let me in into your codebase for debugging?

I think if you trigger various endpoints related to integrations, at some point the bug has to show up.

Are you saying this might happen when another part of my code creates a query, it tampers with the global db object?

Yes, that's what I'm suspecting, that some code tampers with the "integration" query object and leaves an unwanted condition in it.

WHERE "integration"."user_id" = "user"."id" -- <- this line

This makes sense in another query, and this condition probably sticks here left by other query.

romeerez avatar Mar 23 '25 18:03 romeerez

@mordechaim could you add this code temporary? Somewhere no matter where, after defining db.

const rj = db.integration.relations.user.relationConfig.reverseJoin;
db.integration.relations.user.relationConfig.reverseJoin = (...args) => {
  console.trace('suspicious join');
  return rj(...args);
};

That reverseJoin is responsible for adding this condition

WHERE "integration"."user_id" = "user"."id"

and I'm trying to find out where is it called from.

Once reproduced, it will log a stack trace.

romeerez avatar Mar 23 '25 18:03 romeerez

Ok, done. Will keep you posted

mordechaim avatar Mar 23 '25 18:03 mordechaim

The error happened again, no log with the text "suspicious join" in the logs at all.

mordechaim avatar Apr 28 '25 15:04 mordechaim

@mordechaim I'm sorry to not know how to debug this transient bug.

Maybe if you had logs that can shows a history of API calls like "endpoint A is called, then endpoint B, then endpoint C" and the error happened in the C, and you could call those endpoints in order to reproduce it, and we'd know what sequence of queries causes it.

We know for sure that the bug is related to users and integrations. It looks like another query that also involves users and integrations somehow messes up with this query (caches something internally). Maybe you could share all your queries that involve both users and integrations?

I removed caching of generated SQL in orchid-orm 1.45.1. There is a chance it has something do to with the error. Are you using orchid-orm >= 1.45.1?

romeerez avatar May 03 '25 12:05 romeerez

I'm still at 1.42.1 and can't upgrade due to breaking changes and unresolved bugs.

mordechaim avatar May 04 '25 02:05 mordechaim