
Support `insert_into_select` for Postgres table

Open · YuweiXiao opened this pull request 9 months ago · 8 comments

Partially fixes #30

YuweiXiao · Mar 21 '25 14:03

Hi @JelteF, the test cases aren't fully covered yet. I would appreciate some feedback before finalizing the details. Thanks!

YuweiXiao · Mar 21 '25 14:03

My example code from #30:

create table t(a int, x text);
-- 1. easiest to make work (still not super easy)
insert into t select r['a']::int, r['x']::text from duckdb.query($$ SELECT 1 a, 'abc' x $$) r;

Fails with this error:

ERROR:  42804: table row type and query-specified row type do not match
DETAIL:  Table has type text at ordinal position 2, but query expects character varying.
LOCATION:  ExecCheckPlanOutput, nodeModifyTable.c:212

I think that would be worked around by #583, but then the opposite would hold (i.e. inserting into a varchar column would fail). So I think we need some logic to make sure that the types match against each other, and maybe add some code to convert the type that we get back from DuckDB to the type that PG wants for the INSERT. I think that would also fix the generate_series() issue that all of the tests are currently failing on.
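For illustration, a hedged sketch of the two directions (the t_text/t_varchar tables are made up for this example, and the second insert describes what I'd expect after #583, not confirmed behavior):

-- current behavior per the error above: the value comes back as varchar,
-- so a text target column trips ExecCheckPlanOutput
create table t_text(a int, x text);
insert into t_text select r['a']::int, r['x']::text from duckdb.query($$ SELECT 1 a, 'abc' x $$) r;  -- fails today

-- if #583 flipped the returned type to text, a varchar target column would
-- presumably fail the same way, just in the other direction
create table t_varchar(a int, x varchar);
insert into t_varchar select r['a']::int, r['x']::text from duckdb.query($$ SELECT 1 a, 'abc' x $$) r;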

JelteF · Mar 21 '25 15:03

I think that would be worked around by #583, but then the opposite would hold (i.e. inserting into a varchar column would fail). So I think we need some logic to make sure that the types match against each other, and maybe add some code to convert the type that we get back from DuckDB to the type that PG wants for the INSERT. I think that would also fix the generate_series() issue that all of the tests are currently failing on.

Yes, the type check and conversion are substantial. I manually added an Expr to handle the coercion and it seems to be working so far (e.g., for the varchar and generate_series issues). I'll add more test cases to make sure I haven't missed (or broken) anything; code reorganization (& cleanup) is also in progress. Thanks for the feedback, @JelteF!
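For example, a quick smoke test for the generate_series case (a sketch; it assumes DuckDB's generate_series returns BIGINT, which the coercion Expr then has to narrow to the int4 target):

create table nums(n int);
-- DuckDB's generate_series yields BIGINT; the added coercion should narrow
-- the values to the table's int column
insert into nums select r['n']::int from duckdb.query($$ SELECT n FROM generate_series(1, 3) t(n) $$) r;
select * from nums;  -- expect 1, 2, 3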

YuweiXiao · Mar 22 '25 09:03

The PR is ready for review, @JelteF. Please take a look when you have a chance.

Currently, the regression test for hugeint_numeric is failing. The issue seems related to the following behavior, which is reproducible on the main branch. Do you have any insights on this?

select * from duckdb.query($$ SELECT a FROM (VALUES (32942348563242.111222333444555666777888::numeric(38,24)), (NULL::numeric), (123456789.000000000000000000000001::numeric(38,24))) t(a) $$);

                   a
-----------------------------------------
 32942348563242.11122233344455566677789   (correct one should be 32942348563242.111222333444555666777888)

      123456789.00000000000000000000000   (correct one should be 123456789.000000000000000000000001)
(3 rows)

Update: the DuckDB CLI produces the exact same output.
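If I had to guess at the cause (unverified): DuckDB unifies the column type of the VALUES list. A bare NULL::numeric defaults to DECIMAL(18,3), which reserves 15 integer digits; unifying that with DECIMAL(38,24) while staying within the 38-digit maximum leaves only 23 fractional digits, i.e. DECIMAL(38,23), which would match the rounding to 23 decimals above. DuckDB's typeof() should be able to confirm:

-- with a bare NULL::numeric the unified column type likely loses a digit of scale
select * from duckdb.query($$ SELECT typeof(a) FROM (VALUES (1::numeric(38,24)), (NULL::numeric)) t(a) $$);
-- with every branch pinned to the same type it should stay DECIMAL(38,24)
select * from duckdb.query($$ SELECT typeof(a) FROM (VALUES (1::numeric(38,24)), (NULL::numeric(38,24))) t(a) $$);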

YuweiXiao · Mar 24 '25 12:03

Currently, the regression test for hugeint_numeric is failing. The issue seems related to the following behavior, which is reproducible on the main branch. Do you have any insights on this?

I think you can probably fix the test by changing the plain NULL to NULL::NUMERIC(38, 24). At least, that seems to solve the issue when running your reproducing example in the DuckDB CLI.
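For concreteness, the adjusted query would look like this (only checked against the DuckDB CLI):

select * from duckdb.query($$
  SELECT a FROM (VALUES
    (32942348563242.111222333444555666777888::numeric(38,24)),
    (NULL::numeric(38,24)),  -- explicit type instead of a bare NULL::numeric
    (123456789.000000000000000000000001::numeric(38,24))
  ) t(a)
$$);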

JelteF · Apr 16 '25 19:04

To set expectations: I think this is a very cool feature and your implementation looks like the right direction. But it is relatively complex and impactful, so I would like to give it an appropriate round of testing and "baking" (aka testing by merging it and having it on the main branch for a while). Since we want to release 1.0 somewhat soon, I don't think it's realistic to still do that before the 1.0 release. So I added this to the new 1.1 milestone instead. I also won't spend time reviewing this PR in depth until then.

JelteF · Apr 17 '25 08:04

To set expectations: I think this is a very cool feature and your implementation looks like the right direction. But it is relatively complex and impactful, so I would like to give it an appropriate round of testing and "baking" (aka testing by merging it and having it on the main branch for a while). Since we want to release 1.0 somewhat soon, I don't think it's realistic to still do that before the 1.0 release. So I added this to the new 1.1 milestone instead. I also won't spend time reviewing this PR in depth until then.

Totally agree, 1.1 makes sense. By the way, any target date for the 1.0 release?

YuweiXiao · Apr 17 '25 13:04

any target date for the 1.0 release?

The current plan is to do it quickly after the DuckDB 1.3.0 release, so early May. But dates may obviously slip.

JelteF · Apr 17 '25 13:04