Support `insert_into_select` for Postgres table
Partially fixes #30
Hi @JelteF~ The test cases aren't fully covered yet; I'd appreciate some feedback before finalizing the details. Thanks!
My example code from #30:
```sql
create table t(a int, x text);
-- 1. easiest to make work (still not super easy)
insert into t select r['a']::int, r['x']::text from duckdb.query($$ SELECT 1 a, 'abc' x $$) r;
```
Fails with this error:
```
ERROR:  42804: table row type and query-specified row type do not match
DETAIL:  Table has type text at ordinal position 2, but query expects character varying.
LOCATION:  ExecCheckPlanOutput, nodeModifyTable.c:212
```
I think that would be worked around by #583, but then the opposite would hold (i.e. inserting into a varchar column would fail). So I think we need some logic to make sure that the types match each other, and maybe add some code to convert the type that we get back from DuckDB to the type that PG wants for the INSERT. I think that would also fix the `generate_series()` issue that all of the tests are currently failing on.
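For illustration, a hedged sketch of that opposite case (the `t_varchar` table here is hypothetical, not part of the issue or the test suite):

```sql
-- Hypothetical sketch: if #583 mapped DuckDB's strings back as text,
-- this insert into a varchar column would presumably fail with the
-- analogous "row type does not match" error, just in the other direction.
create table t_varchar(a int, x varchar);
insert into t_varchar
select r['a']::int, r['x']::text
from duckdb.query($$ SELECT 1 a, 'abc' x $$) r;
```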
> I think that would be worked around by #583, but then the opposite would hold (i.e. inserting into a varchar column would fail). So I think we need some logic to make sure that the types match each other, and maybe add some code to convert the type that we get back from DuckDB to the type that PG wants for the INSERT. I think that would also fix the `generate_series()` issue that all of the tests are currently failing on.
Yes, the type check and conversion are substantial. I manually added an Expr to handle the coercion, and it seems to be working so far (e.g., for the varchar and `generate_series()` issues). I'll add more test cases to make sure I haven't missed (or broken) anything; code reorganization and cleanup are also in progress. Thanks for the feedback, @JelteF!
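A quick sketch of the `generate_series()` case (hedged assumptions: DuckDB's `generate_series()` yields BIGINT, while PG's `generate_series(int, int)` yields `int`; the exact failing tests aren't reproduced here):

```sql
-- Sketch: when this INSERT is executed through DuckDB, generate_series
-- presumably comes back as BIGINT, which the added coercion Expr must
-- convert to the int4 type of column a before the INSERT.
insert into t(a) select * from generate_series(1, 3);
```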
The PR is ready for review~ @JelteF, please take a look when you have a chance.
Currently, the `hugeint_numeric` regression test is failing. The issue seems related to the following behavior, which is reproducible on the main branch. Do you have any insights on this?
```
select * from duckdb.query($$ SELECT a FROM (VALUES (32942348563242.111222333444555666777888::numeric(38,24)), (NULL::numeric), (123456789.000000000000000000000001::numeric(38,24))) t(a) $$);
                   a
----------------------------------------
 32942348563242.11122233344455566677789

 123456789.00000000000000000000000
(3 rows)
```

The correct values should be `32942348563242.111222333444555666777888` and `123456789.000000000000000000000001`.
Update: the DuckDB CLI produces the exact same output.
> Currently, the `hugeint_numeric` regression test is failing. The issue seems related to the following behavior, which is reproducible on the main branch. Do you have any insights on this?
I think you can probably fix the test by changing the plain `NULL` to `NULL::NUMERIC(38, 24)`. At least, that seems to solve the issue when running your reproduction example in the DuckDB CLI.
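For reference, the reproduction query with that change applied (only the `NULL` cast differs from the version above):

```sql
select * from duckdb.query($$ SELECT a FROM (VALUES
    (32942348563242.111222333444555666777888::numeric(38,24)),
    (NULL::numeric(38,24)),
    (123456789.000000000000000000000001::numeric(38,24))) t(a) $$);
```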
To set expectations: I think this is a very cool feature and your implementation looks like the right direction. But it is relatively complex and impactful, so I would like to give it an appropriate round of testing and "baking" (aka testing by merging it and having it on the main branch for a while). Since we want to release 1.0 somewhat soon, I don't think it's realistic to still do that before the 1.0 release. So I added this to the new 1.1 milestone instead. I also won't spend time reviewing this PR in depth until then.
> To set expectations: I think this is a very cool feature and your implementation looks like the right direction. But it is relatively complex and impactful, so I would like to give it an appropriate round of testing and "baking" (aka testing by merging it and having it on the `main` branch for a while). Since we want to release 1.0 somewhat soon, I don't think it's realistic to still do that before the 1.0 release. So I added this to the new 1.1 milestone instead. I also won't spend time reviewing this PR in depth until then.
Totally agree, 1.1 makes sense. BTW, is there a target date for the 1.0 release?
> is there a target date for the 1.0 release?
The current plan is to do it quickly after the DuckDB 1.3.0 release, so early May. But dates may obviously slip.