workers-sdk icon indicating copy to clipboard operation
workers-sdk copied to clipboard

๐Ÿ› BUG: when trying to left join 2 tables that have same name columns - second column is not returned

Open AndriiSherman opened this issue 2 years ago โ€ข 15 comments

Which Cloudflare product(s) does this pertain to?

D1

What version of Wrangler are you using?

2.19.0

What operating system are you using?

macOS

Describe the Bug

As per issue #555 we can see, that d1 stopped returning duplicate names from different tables. It was working before and it's how any other driver is doing for raw mode

$ pnpm wrangler d1 execute dzltest --command='select "users"."id", "posts"."id", "posts"."ownerId", "posts"."title", "posts"."content" from "users" left join "posts" on "posts"."ownerId" = "users"."id";'

โ”Œโ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ id โ”‚ ownerId โ”‚ title โ”‚ content   โ”‚
โ”œโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 2  โ”‚ 1       โ”‚ test  โ”‚ 111111111 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 3  โ”‚ 1       โ”‚ ueoau โ”‚ ueueu     โ”‚
โ”œโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 1  โ”‚ 1       โ”‚ xxx   โ”‚ yyy       โ”‚
โ””โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

But should return id for user and id for posts as well Expected

โ”Œโ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ id โ”‚ id โ”‚ ownerId โ”‚ title โ”‚ content   โ”‚
โ”œโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 2  โ”‚ 1  โ”‚    1    โ”‚ test  โ”‚ 111111111 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 3  โ”‚ 2  โ”‚    1    โ”‚ ueoau โ”‚ ueueu     โ”‚
โ”œโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 1  โ”‚ 3  โ”‚    1    โ”‚ xxx   โ”‚ yyy       โ”‚
โ””โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

AndriiSherman avatar May 07 '23 15:05 AndriiSherman

๐Ÿ‘‹๐Ÿป Greetings, I appreciate the report.

Feel free to check out the Cloudflare Discord where there is a channel for D1 as well: https://discord.com/invite/cloudflaredev

Cc: @rozenmd

JacobMGEvans avatar May 08 '23 15:05 JacobMGEvans

@AndriiSherman could I bother you for a schema/db sql file to reproduce this?

rozenmd avatar May 15 '23 09:05 rozenmd

@rozenmd Although I am not Andrii, I think I have the same problem and here's a minimal example schema that has the issue (pretty much any schema that you can join where both have one shared field name (here it's "id"):

import type { InferModel } from "drizzle-orm";
import {
  integer, primaryKey,
  sqliteTable,
  text
} from "drizzle-orm/sqlite-core";

export const users = sqliteTable(
  "users",
  {
    id: integer("id").primaryKey({
      autoIncrement: true
    }),
    email: text("email").notNull(),
  }
);

export const subscriptions = sqliteTable(
  "subscriptions",
  {
    id: integer("id").primaryKey({ autoIncrement: true }),
    user: integer("user").notNull().references(() => users.id, {
      onDelete: "cascade"
    })
  }
);

xdivby0 avatar May 15 '23 09:05 xdivby0

Agreed. Any SELECT query involving joins and having the same fields in the response will not work as expected.

In our case, we are using the raw mode for querying, but it appears that the raw mode in the "d1" driver differs from the raw mode in other drivers

Link to a code: https://github.com/cloudflare/workers-sdk/blob/main/packages/wrangler/templates/d1-beta-facade.js#L177

After receiving the response in JSON format, it is subsequently mapped to an array of arrays at the code level. Consequently, fields with the same name will be overridden.

AndriiSherman avatar May 15 '23 09:05 AndriiSherman

We've traced this back to workerd, and will be discussing possible solutions here: https://github.com/cloudflare/workerd/pull/696

Keep in mind @AndriiSherman that a temporary fix on Drizzle's side is still possible by using .all() and manipulating the data into the array of arrays format yourselves.

rozenmd avatar May 24 '23 09:05 rozenmd

Keep in mind @AndriiSherman that a temporary fix on Drizzle's side is still possible by using .all() and manipulating the data into the array of arrays format yourselves.

Iโ€™m pretty sure it behaves wrongly even with this option. Iโ€™m gonna need to check, but how could it return multiple identical keys in an array of objects?

Normally SQL databases return the table name in the joined tableโ€™s column names. Or even in both.

Do you want me to check, even if itโ€™s impossible to return multiple keys of the same name, @rozenmd?

itsmatteomanf avatar Jun 25 '23 19:06 itsmatteomanf

We have the same problem, does anything block fix for this bug?

marina-chibizova avatar Aug 01 '23 20:08 marina-chibizova

We are working on this. Current priority is larger databases, which is the biggest ask from most users.

elithrar avatar Aug 01 '23 20:08 elithrar

Thanks for a prompt response @elithrar!

marina-chibizova avatar Aug 01 '23 20:08 marina-chibizova

We are working on this. Current priority is larger databases, which is the biggest ask from most users.

Is there any update on resolving this issue? I have tables with the same names for common fields like created_at, updated_at, id etc and joins don't work because all the response data is shifting columns in the results.

jwhits avatar Jan 07 '24 22:01 jwhits

FYI, this will be fixed D1 once https://github.com/cloudflare/workerd/pull/1586 is merged and rolls out

It will only fix the results of .raw(), .all() will still have later columns with the same name overwriting earlier ones.

geelen avatar Jan 30 '24 00:01 geelen

This is now released. .raw() returns the full results set regardless of column name clashes. Sadly Wrangler still uses .all() so this won't be visible except from inside the Worker API, so I'll leave this ticket open for now.

geelen avatar Feb 13 '24 22:02 geelen

Changelog: https://developers.cloudflare.com/d1/platform/changelog/#2024-02-13

cc @AndriiSherman on the Drizzle team.

elithrar avatar Feb 14 '24 13:02 elithrar

Changelog: https://developers.cloudflare.com/d1/platform/changelog/#2024-02-13

cc @AndriiSherman on the Drizzle team.

Has it been released already? I would like to run some tests, can't find this version for types on npm

AndriiSherman avatar Feb 15 '24 12:02 AndriiSherman

@AndriiSherman sorry for the delay there - the types are released in 4.20240222.0

rozenmd avatar Feb 23 '24 14:02 rozenmd

@rozenmd I've tested this with latest types version (4.20240320.1), and the .raw() method still doesn't return multiple columns with the same name. This change shouldn't require any actions from our side, since we don't bundle @cloudflare/workers-types with Drizzle, it's used as a peer dependency, so it should be enough for users to just update their dependency version. However, it doesn't seem to be the case.

Here's a reproduction repo: https://github.com/dankochetov/drizzle-d1-bug I've added a raw D1 call to make sure it wasn't Drizzle's fault, and observed the same behavior. As you could see from the logs, the columns list in the response don't match the columns list in the query.

image image

dankochetov avatar Mar 20 '24 22:03 dankochetov

Hi @dankochetov, that looks like a bug in local mode (probably related to getPlatformProxy(), though I'll need to dig in and figure out exactly what).

The .raw() behaviour has been fixed in production though, I've just converted your repro to actually publish a worker and I get the expected output:

image

I'll take a look at the local behaviour this week, apologies for the oversight.

geelen avatar Mar 24 '24 21:03 geelen

has been fixed in production though

@geelen I'm using the latest wrangler 3.37.0 to deploy my app with compatibility_date = "2024-03-25" and I'm still facing this issue.

Is there anything special I need to do to get the new version?

stefanmaric avatar Mar 25 '24 09:03 stefanmaric

Running my query from the D1 Dashboard console UI or wrangler D1 CLI get me the same, buggy result. It collapses columns with the same name from joined tables into a single one, the last joined table seemingly taking precedence.

SELECT id, A.description, B.description FROM table_a A LEFT JOIN table_b B ON B.id = A.id WHERE id = 'abc'

stefanmaric avatar Mar 25 '24 09:03 stefanmaric

Are there any updates on this?

F0rce avatar Apr 19 '24 09:04 F0rce

I added a test case for the problem: https://github.com/cloudflare/workers-sdk/pull/5917

I went down a rabbit hole chasing this bug through workerd (where I also added tests, but it passed), and now, Workers SDK.

Hopefully someone from the maintainers will notice it and help me fix the problem. Any tips will be appriciated.

kossnocorp avatar May 25 '24 07:05 kossnocorp

I found the root cause of the problem and fixed it: https://github.com/cloudflare/workers-sdk/pull/5917

Hopefully, the PR will be merged and shipped soon.

kossnocorp avatar May 26 '24 00:05 kossnocorp

@kossnocorp - that PR specifically fixes Workers when run locally, not the execute command. I'll have a follow-up PR for the execute command shortly.

rozenmd avatar May 27 '24 14:05 rozenmd

Ugh GitHub - nope not fixed yet

rozenmd avatar May 27 '24 15:05 rozenmd