
Support pg_bulkload

Open ujangja opened this issue 2 years ago • 1 comment

I have a Citus 11 environment.

I used pg_bulkload to load data from a CSV file into a Citus distributed table. This tool also has better handling of unparsable rows and duplicates.

The load messages reported no errors and about 5 million records finished, but counting the table always returns 0.

I tried a second load and got a primary key constraint violation error.

I used pg_bulkload's buffered writer, which uses the heap_insert API in its code. Does Citus parse the data to distribute it before heap_insert?

ujangja • Sep 09 '22 23:09

Interesting, I had not heard of pg_bulkload. Its approach is not compatible with Citus, which intercepts writes in the query planner. By using pg_bulkload, you are writing to the original table on the coordinator, which is not accessible via SQL.

The recommended way to bulk load data is via COPY.
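
As a rough illustration of the COPY route, here is a minimal Python sketch that streams a CSV file into a table through the coordinator using psycopg2's `copy_expert`. The helper names `build_copy_sql` and `load_csv`, the table name, the column list, and the connection string are all hypothetical, and the sketch assumes the CSV has a header row and an unqualified table name.

```python
def build_copy_sql(table, columns, delimiter=","):
    """Build a COPY ... FROM STDIN statement for a CSV file with a header row.

    `table` and `columns` are hypothetical names supplied by the caller;
    `table` is assumed to be unqualified (no schema prefix).
    """
    def quote_ident(name):
        # Double any embedded quotes so the identifier stays one token.
        return '"' + name.replace('"', '""') + '"'

    cols = ", ".join(quote_ident(c) for c in columns)
    return (
        f"COPY {quote_ident(table)} ({cols}) "
        f"FROM STDIN WITH (FORMAT csv, HEADER true, DELIMITER '{delimiter}')"
    )


def load_csv(dsn, table, columns, csv_path):
    """Stream a CSV file to the coordinator with COPY (assumes psycopg2 is installed)."""
    import psycopg2  # imported here so build_copy_sql works without the driver

    conn = psycopg2.connect(dsn)
    try:
        # `with conn` commits on success and rolls back on error.
        with conn, conn.cursor() as cur, open(csv_path, newline="") as f:
            cur.copy_expert(build_copy_sql(table, columns), f)
    finally:
        conn.close()


if __name__ == "__main__":
    # Print the statement that would be sent; no database is needed for this.
    print(build_copy_sql("events", ["id", "payload"]))
```

Because the statement goes through SQL on the coordinator rather than writing heap pages directly, Citus sees the COPY and can route the rows to the shards.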

marcocitus • Sep 12 '22 10:09