postgres_scanner icon indicating copy to clipboard operation
postgres_scanner copied to clipboard

Parallel COPY to Postgres ?

Open anayrat opened this issue 1 year ago • 3 comments

What happens?

Hello, I tried to do a parallel import in Postgres. I wonder if postgres_scanner allows this ?

To Reproduce

I tried :

ATTACH 'user=postgres host=/var/run/postgresql port=5440' AS db (TYPE postgres);
COPY db.votes2 FROM 'data.zstd/*' (FORMAT parquet);

I tried to change file format (csv, parquet), compression, multiples files... I also tried to first import in duckdb, then, copy to Postgres. I only see one COPY process on Postgres.

OS:

Linux

PostgreSQL Version:

16

DuckDB Version:

v0.10.0 20b1486d11

DuckDB Client:

CLI

Full Name:

Adrien Nayrat

Affiliation:

Self Employed

Have you tried this on the latest main branch?

  • [X] I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • [X] I agree

anayrat avatar Feb 26 '24 11:02 anayrat

Thanks for the report! Parallel imports are not supported right now. It is also rather hard (and perhaps not even possible?) to support parallel imports in a transaction-safe manner. We could perhaps offer a bulk-import option that does not respect transactions in the future.

Mytherin avatar Feb 26 '24 12:02 Mytherin

Thanks for your quick reply ! Indeed, it is a good idea for the bulk import without taking care of transaction. FYI, it seems possible to perform the copy thanks to snapshot export :

Session 1 Session 2
BEGIN;
SELECT pg_export_snapshot();

pg_export_snapshot
---------------------
00000003-00000016-1
(1 row)

copy t1 from 'file.copy' ;
COPY 10
BEGIN ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000003-00000016-1';
copy t1 from 'file.copy' ;
COPY 10
COMMIT;
COMMIT;
select count(*) from votes2;
count
-------
20
(1 row)

anayrat avatar Feb 26 '24 14:02 anayrat

With such feature, DBT could be a very good tool to load massive amount of data in Postgres. It could be my new favorite ELT/ETL :)

anayrat avatar Mar 22 '24 12:03 anayrat