risingwave
DML cannot provide atomicity.
It seems that our DML (INSERT/DELETE/UPDATE) on a table currently cannot provide atomicity, even for a single statement. If a statement's input is large enough, it may produce multiple data chunks, and a barrier can land between them. One solution is to buffer all of the input and write it out as one large chunk. Does anyone have a better idea?
We can still generate multiple chunks. Atomicity can be guaranteed by not injecting a barrier between these chunks.
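To make the idea concrete, here is a minimal toy sketch (not RisingWave's actual barrier mechanism). It assumes a single stream in which a statement marks its start and end, and any barrier that arrives in between is held back until the statement finishes. The names `DmlStream`, `begin`, `end`, and `inject_barrier` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    stmt_id: int
    rows: list


@dataclass
class Barrier:
    epoch: int


class DmlStream:
    """Toy message stream that defers barriers while a statement is open."""

    def __init__(self):
        self.out = []               # messages in emission order
        self.open_stmt = None       # id of the statement currently writing
        self.pending_barriers = []  # barriers held back until the statement ends

    def begin(self, stmt_id):
        assert self.open_stmt is None, "one statement at a time in this sketch"
        self.open_stmt = stmt_id

    def write(self, rows):
        self.out.append(Chunk(self.open_stmt, list(rows)))

    def end(self):
        # Release deferred barriers only after the last chunk is emitted.
        self.open_stmt = None
        self.out.extend(self.pending_barriers)
        self.pending_barriers.clear()

    def inject_barrier(self, epoch):
        barrier = Barrier(epoch)
        if self.open_stmt is not None:
            self.pending_barriers.append(barrier)  # would split the statement
        else:
            self.out.append(barrier)


stream = DmlStream()
stream.inject_barrier(1)
stream.begin(stmt_id=7)
stream.write([1, 2])
stream.inject_barrier(2)  # arrives mid-statement, so it is deferred
stream.write([3, 4])
stream.end()
kinds = [type(m).__name__ for m in stream.out]
print(kinds)
```

Both chunks of statement 7 end up in the same epoch. The cost is that a very large statement delays every barrier behind it for as long as it runs.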
Cool, we can enhance DML with this idea later.
Yes, we discussed this in the early days. DML in our system is used mostly for demos and test cases, so concurrency control is not important for us.
> Atomicity can be guaranteed by not injecting a barrier between these chunks.
This might be problematic if the query is too large. 😢 I'm considering persisting the input chunks somewhere first (like frontend?) and checkpointing the offset during the insertion process, which is very similar to the connector source logic. 🤣
BTW, there are some more serious unresolved consistency problems for update and delete.
- https://singularity-data.quip.com/KlRvAysabUma/Handle-Concurrent-Writes-Correctly
- https://github.com/singularity-data/risingwave/issues/1186
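The "persist the input chunks, then checkpoint the offset" idea from this comment can be sketched roughly as follows. This is a toy model under stated assumptions: the persisted log is just an in-memory list, a checkpoint saves the offset together with the table state, and `PersistedInsert` and its methods are hypothetical names, not RisingWave or connector APIs.

```python
class PersistedInsert:
    """Toy model of an insert whose input is persisted before it is applied."""

    def __init__(self, chunks):
        self.log = [list(c) for c in chunks]  # persisted input (hypothetical store)
        self.committed_offset = 0  # offset saved at the last checkpoint
        self.committed_rows = []   # table state as of the last checkpoint
        self.offset = 0            # in-memory progress
        self.rows = []             # in-memory table state

    def apply_next(self):
        self.rows.extend(self.log[self.offset])
        self.offset += 1

    def checkpoint(self):
        # Offset and table state are saved together, as with a source offset.
        self.committed_offset = self.offset
        self.committed_rows = list(self.rows)

    def recover(self):
        # After a failure, roll back to the last checkpoint.
        self.offset = self.committed_offset
        self.rows = list(self.committed_rows)

    def run_to_completion(self):
        while self.offset < len(self.log):
            self.apply_next()
        self.checkpoint()


job = PersistedInsert([[1, 2], [3, 4], [5, 6]])
job.apply_next()
job.checkpoint()         # a barrier passes after the first chunk
job.apply_next()         # second chunk applied but not yet checkpointed
job.recover()            # simulated failure
job.run_to_completion()  # resumes from offset 1
print(job.rows)
```

The sketch only shows that replaying from the checkpointed offset neither loses nor duplicates rows. It does not address whether readers can observe the partially applied statement between checkpoints.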
> Yes, we discussed this in the early days. DML in our system is used mostly for demos and test cases, so concurrency control is not important for us.
Not so. DML is useful for users who don't have Kafka.
Some users' data pipelines are as simple as:
TP Database ----[ CDC Tools ]--> AP Database
Or,
TP Database ----[ CDC Plugin + ETL (Flink) ]--> AP Database
These users would like to use RisingWave as a database and simply inject data with INSERT.
https://github.com/risingwavelabs/rfcs/pull/59