incubator-pegasus icon indicating copy to clipboard operation
incubator-pegasus copied to clipboard

What if bulk load successfully only on some nodes while failed on other nodes?

Open byteroll opened this issue 2 years ago • 3 comments

From this http://pegasus.incubator.apache.org/en/2020/02/18/bulk-load-design.html, it says

需要说明的是,如果在bulk load在ingestion阶段失败或者在ingestion阶段执行cancel bulk load操作,可能会出现部分partition完成ingestion,而部分失败或者被cancel的情况,即部分partition成功导入了数据,部分partition没有导入数据的现象。

but it doesn't tell how to handle this situation. Is there any solution to solve this problem?

byteroll avatar Jul 28 '22 00:07 byteroll

There are two cases which will cause bulk load ingestion data inconsistence:

  • some partitions meet unrecoverable ingestion error during ingestion and some not
    • simple network error or replica 2pc are NOT unrecoverable error, only the ingested files can not be recognized by rocksdb is unrecoverable
  • force cancel during ingestion - this is only triggered by user not by system itself

There are currently no solution to handle such situation automatically by system, because pegasus doesn't support transaction through different partitions. For example, table has 8 partitions. Partition 0 ingest succeed, but partition 1 receive wrong-format sst files which is a unrecoverable error, it can not ingest those files, bulk load failed, and partition 0 won't reset those data. Ingestion is just like batch write, different partitions won't affect others' data, they are just different partitions. If user use our client batch write interface, it also can not gurantee that data wrote into different partitions should always be consistent.

The only solution is to retry bulk load after user fix broken files, which is triggered by user manually.

hycdong avatar Jul 28 '22 01:07 hycdong

Thanks for the explanation. By the way, is there any plan to support transaction through different partitions?

byteroll avatar Jul 29 '22 09:07 byteroll

Transaction is not in our current plan, because we don't find out its user case, but it can be discussed and implemented if any user has strong request on transaction in future.

hycdong avatar Aug 01 '22 09:08 hycdong