postgres_scanner
postgres_scanner copied to clipboard
Feature request: Expose temporary tables
I want to copy some data to a temporary table in postgres. When I do this using the duckdb api, the resulting table is not available directly:
CALL postgres_execute('postgres_db', $$
CREATE TEMP TABLE my_datasets(
val TEXT
);
$$);
select * from postgres_db.my_datasets;
Catalog Error: Table with name my_datasets does not exist!
Did you mean "postgres_db.pg_temp_25.my_datasets"?
LINE 1: select * from postgres_db.my_datasets;
However, if I run the query as string, it will go through:
FROM postgres_query('postgres_db', $$ select * from my_datasets; $$);
┌─────────┐
│ val │
│ varchar │
├─────────┤
│ 0 rows │
└─────────┘
For query, this way is okay. But for COPY FROM statement if I want to copy some data to postgres, I am not sure if there is a workaround.
Thanks for the report! It seems this is a namespacing issue, the table is in the pg_temp_X schema which DuckDB does not look through. As a work-around you can query the temp schema directly, e.g.:
SELECT * FROM postgres_db.pg_temp_25.my_datasets;
@Mytherin Thanks! Yes, I understand that. postgres will create an alias pg_temp pointing to the session specific pg_temp_NN, but duckdb loses track of that. Since the _NN suffix is random assigned, I will need to rely on external logic to get the actual table name.
In my use case, I have some small datasets locally and want to join with the large postgres DB. For performance reason, I want the join to happen on the postgres side. I can start my analysis with
duckdb -init copy_datasets_to_postgres_temp_table.sql
My current workaround is pretty awkward, relying on .shell and .read statement to call out external script to generate and execute the COPY statement (since it has the dynamic pg_temp_NN part).
I need temp table also this sql can not execute: update gp.tenant_peacebird_biz.rst_ra_sku_org_detail a set compute_status='0' from tmp_distinct_order_id b where a.skc_order_id=b.skc_order_id and a.day_date = '2024-04-29' and a.is_deleted = '0'; so I need write tmp_distinct_order_id to gp temp table first,but how to do?
I've implemented support for this in https://github.com/duckdb/postgres_scanner/pull/234
I had exactly the same problem.
The error I was getting
duckdb.duckdb.CatalogException: Catalog Error: Table with name some_table does not exist!
Did you mean "db.pg_temp_7.some_table"?
Could it be possible to change the error message to
Did you mean "db.pg_temp.some_table"?
Use of pg_temp seems to be not documented as well.
Asking as it took me a bit of time to find this issue, and then look through the actual pull request to understand how this is fixed.