
Evaluate updating Bulk Loading section's Note

Open ozgune opened this issue 8 years ago • 2 comments

We have the following note as part of the bulk loading section:

"There is no notion of snapshot isolation across shards, which means that a multi-shard SELECT that runs concurrently with a COPY might see it committed on some shards, but not on others. If the user is storing events data, he may occasionally observe small gaps in recent data. It is up to applications to deal with this if it is a problem (e.g. exclude the most recent data from queries, or use some lock).

If COPY fails to open a connection for a shard placement then it behaves in the same way as INSERT, namely to mark the placement(s) as inactive unless there are no more active placements. If any other failure occurs after connecting, the transaction is rolled back and thus no metadata changes are made."

I had three questions on this section:

  1. When the user runs COPY, Citus currently uses transactions to commit or roll back the batch load. I ran multiple COPY operations alongside concurrent SELECT count(*) FROM github_events; queries, and I saw transactional behavior. Are we worried about the window in which parallel commits across machines take time to complete? Isn't that window small?
  2. The first and second paragraphs in this note seem unrelated. Do we have two notes?
  3. Do we want to document \copy or COPY? PostgreSQL's documentation generally talks about COPY. That said, psql's \copy is more convenient to use because it reads the file on the client side.
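
To make the window in question 1 concrete, here is a toy model in Python. It is only a sketch under stated assumptions: shards are plain in-memory lists, a COPY commits its rows on one shard at a time, and a multi-shard count may run between any two of those commits. The names `copy_commit_steps` and `multi_shard_count` are hypothetical and are not part of Citus.

```python
# Toy model only: not Citus code. Each "shard" is a list of committed rows.

def copy_commit_steps(shards, batch):
    """Commit one shard's portion of a COPY at a time.

    Yields after every per-shard commit so that a concurrent reader
    can be interleaved between commits.
    """
    for shard_id, rows in batch.items():
        shards[shard_id].extend(rows)  # this shard's commit becomes visible
        yield shard_id


def multi_shard_count(shards):
    """Stand-in for SELECT count(*) fanned out across all shards."""
    return sum(len(rows) for rows in shards.values())


shards = {0: [], 1: []}
batch = {0: ["e1", "e2"], 1: ["e3"]}  # rows of one COPY, split by shard

observed = [multi_shard_count(shards)]  # count before the COPY commits anywhere
for _ in copy_commit_steps(shards, batch):
    observed.append(multi_shard_count(shards))  # count after each shard commits

print(observed)  # [0, 2, 3]
```

The middle value is the case the note describes: the reader sees the COPY committed on one shard but not yet on the other. In this model the window lasts only as long as the gap between per-shard commits, which matches the intuition that it is small, though not zero.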

ozgune avatar Dec 21 '16 01:12 ozgune

(I'm scanning these issues to see which are still relevant, and can confirm that this note still exists in https://docs.citusdata.com/en/v7.3/dist_tables/dml.html#bulk-loading )

begriffs avatar Mar 28 '18 21:03 begriffs

@onderkalaci do you know whether the warnings in https://docs.citusdata.com/en/v8.3/develop/reference_dml.html#copy-command-bulk-load are still accurate?

jonels-msft avatar Oct 28 '19 21:10 jonels-msft