risingwave

DML cannot provide atomicity.

Open chenzl25 opened this issue 2 years ago • 6 comments

It seems our DML (INSERT/DELETE/UPDATE) for a table cannot currently provide atomicity, even for a single statement. If a DML statement's input is large enough, it may produce multiple data chunks, and a barrier can land between those chunks. One solution is to buffer all of its input and write it out as one large chunk. Does anyone have a better idea?
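To make the problem concrete, here is a minimal sketch in Rust. The `Message` enum and both helper functions are hypothetical and only model the idea that rows before a barrier become visible together; they are not RisingWave's actual types. It shows a statement's rows being split across two epochs, and how buffering everything into one chunk avoids that.

```rust
// Hypothetical, simplified model; these are not RisingWave's actual types.
#[derive(Debug, Clone, PartialEq)]
enum Message {
    Chunk(Vec<i64>), // rows produced by one DML statement
    Barrier,         // epoch boundary in this simplified model
}

/// Groups rows by the epoch in which they would become visible.
/// A new epoch starts after every barrier.
fn rows_per_epoch(stream: &[Message]) -> Vec<Vec<i64>> {
    let mut epochs = vec![Vec::new()];
    for msg in stream {
        match msg {
            // `epochs` always holds at least one entry, so `unwrap` cannot fail.
            Message::Chunk(rows) => epochs.last_mut().unwrap().extend(rows.iter().copied()),
            Message::Barrier => epochs.push(Vec::new()),
        }
    }
    epochs
}

/// Buffers all chunks of one statement into a single large chunk,
/// so that no barrier can fall in the middle of the statement.
fn buffer_statement(chunks: Vec<Vec<i64>>) -> Message {
    Message::Chunk(chunks.into_iter().flatten().collect())
}
```

The obvious cost of buffering is memory: the whole statement's input has to be held before anything is written out.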

chenzl25 avatar Aug 17 '22 11:08 chenzl25

We can still generate multiple chunks. Atomicity can be guaranteed by not injecting a barrier in between these chunks.
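A rough sketch of that idea in Rust, assuming a hypothetical `Event` stream in which statement boundaries and barrier requests are explicit (none of these names come from the RisingWave codebase): chunks are forwarded immediately, while any barrier requested in the middle of a statement is held back until the statement ends.

```rust
// Hypothetical event and message types, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Begin,           // a DML statement starts
    Chunk(Vec<i64>), // one chunk of that statement
    End,             // the statement has finished
    BarrierRequest,  // the system wants to inject a barrier now
}

#[derive(Debug, Clone, PartialEq)]
enum Output {
    Chunk(Vec<i64>),
    Barrier,
}

/// Forwards chunks as they arrive, but defers any barrier requested while a
/// statement is in flight and emits a single barrier once the statement ends.
fn defer_barriers(events: Vec<Event>) -> Vec<Output> {
    let mut out = Vec::new();
    let mut in_statement = false;
    let mut barrier_pending = false;
    for event in events {
        match event {
            Event::Begin => in_statement = true,
            Event::Chunk(rows) => out.push(Output::Chunk(rows)),
            Event::End => {
                in_statement = false;
                if barrier_pending {
                    out.push(Output::Barrier);
                    barrier_pending = false;
                }
            }
            Event::BarrierRequest => {
                if in_statement {
                    // Several requests during one statement collapse into one barrier.
                    barrier_pending = true;
                } else {
                    out.push(Output::Barrier);
                }
            }
        }
    }
    out
}
```

In this sketch the barrier is delayed for as long as the statement runs, so a very long statement stalls it for just as long.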

hzxa21 avatar Aug 17 '22 12:08 hzxa21

Cool, we can enhance DML with this idea later.

chenzl25 avatar Aug 18 '22 03:08 chenzl25

Yes, we discussed this in the early days. DML in our system is used mostly for demos and test cases, so concurrency control is not important for us.

liurenjie1024 avatar Aug 18 '22 07:08 liurenjie1024

> Atomicity can be guaranteed by not injecting a barrier in between these chunks.

This might be problematic if the query is too large. 😢 I'm considering persisting the input chunks somewhere first (like frontend?) and checkpointing the offset during the insertion process, which is very similar to the connector source logic. 🤣
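A minimal sketch of what I mean, in Rust. `PersistedInsert` and its methods are hypothetical names, and a plain `Vec` stands in for wherever the chunks would really be persisted. The reader advances an in-memory offset, saves it at each checkpoint, and rewinds to the saved offset on recovery.

```rust
// Hypothetical type; a Vec stands in for durable storage of the input chunks.
struct PersistedInsert {
    chunks: Vec<Vec<i64>>,      // input chunks persisted before insertion starts
    offset: usize,              // in-memory position of the next chunk to emit
    checkpointed_offset: usize, // position saved at the last checkpoint
}

impl PersistedInsert {
    fn new(chunks: Vec<Vec<i64>>) -> Self {
        Self { chunks, offset: 0, checkpointed_offset: 0 }
    }

    /// Returns the next chunk and advances the in-memory offset,
    /// or `None` once every chunk has been emitted.
    fn next_chunk(&mut self) -> Option<Vec<i64>> {
        let chunk = self.chunks.get(self.offset).cloned();
        if chunk.is_some() {
            self.offset += 1;
        }
        chunk
    }

    /// Called at a checkpoint: records how far the insertion has progressed.
    fn checkpoint(&mut self) {
        self.checkpointed_offset = self.offset;
    }

    /// Called after a failure: rewinds to the last checkpointed position,
    /// so chunks emitted after that checkpoint are replayed.
    fn recover(&mut self) {
        self.offset = self.checkpointed_offset;
    }
}
```

This only covers resuming a large insertion without holding back barriers; it does not by itself hide the partially inserted rows from readers.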

BugenZhao avatar Aug 18 '22 08:08 BugenZhao

BTW, there are some more serious, still unresolved consistency problems for UPDATE and DELETE.

  • https://singularity-data.quip.com/KlRvAysabUma/Handle-Concurrent-Writes-Correctly
  • https://github.com/singularity-data/risingwave/issues/1186

BugenZhao avatar Aug 18 '22 08:08 BugenZhao

> Yes, we discussed this in the early days. DML in our system is used mostly for demos and test cases, so concurrency control is not important for us.

Not so. DML is useful for users that don't have Kafka.

Some users' data pipelines are as simple as:

TP Database ----[ CDC Tools ]--> AP Database

Or,

TP Database ----[ CDC Plugin + ETL (Flink) ]--> AP Database

These users would like to use RisingWave as a database and simply inject data with INSERT.

fuyufjh avatar Sep 22 '22 06:09 fuyufjh

https://github.com/risingwavelabs/rfcs/pull/59

xxchan avatar May 14 '23 10:05 xxchan