spark-acid

Issue-70 Fix the repartitioning logic to handle statement IDs

Open amoghmargoor opened this issue 4 years ago • 0 comments

For UPDATE/DELETE, we were repartitioning on the encoded bucketId so that all rows in the same bucket are processed by the same task. However, rows can share a bucket yet have different encoded bucketIds, because an encoded bucketId is composed of both the bucket and the statementId. Hence, rows with the same bucket can end up in different tasks, which can cause conflicts because different tasks will be writing to the same delete delta bucket file. Copied from SPAR-4637.
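
To make the problem concrete, here is a minimal sketch in plain Python (no Spark involved). It assumes a bit layout modeled on Hive's `BucketCodec` V1 (codec version in the top 3 bits, bucket in bits 16-27, statementId in bits 0-11). The helper names `encode_bucket`, `decode_bucket`, and `task_for` are hypothetical, and the modulo is only a stand-in for whatever hash partitioner the job actually uses. It illustrates the idea of partitioning on the decoded bucket rather than on the encoded value; it is not the project's actual fix.

```python
# Assumed layout (modeled on Hive's BucketCodec V1, not stated in this issue):
#   bits 29-31: codec version | bits 16-27: bucket | bits 0-11: statementId
CODEC_VERSION = 1


def encode_bucket(bucket: int, statement_id: int) -> int:
    """Pack bucket and statementId into a single encoded bucketId."""
    return (CODEC_VERSION << 29) | (bucket << 16) | statement_id


def decode_bucket(encoded: int) -> int:
    """Extract only the bucket, ignoring the statementId bits."""
    return (encoded >> 16) & 0xFFF


def task_for(key: int, num_tasks: int) -> int:
    """Hypothetical stand-in for the partitioner: key modulo task count."""
    return key % num_tasks


# Two rows in the same bucket (3) written by different statements (0 and 1).
row_a = encode_bucket(3, 0)
row_b = encode_bucket(3, 1)

# Partitioning on the encoded value can separate the two rows ...
print(task_for(row_a, 4), task_for(row_b, 4))

# ... while partitioning on the decoded bucket keeps them together.
print(task_for(decode_bucket(row_a), 4), task_for(decode_bucket(row_b), 4))
```

With this layout, the two rows differ only in their low statementId bits, so any partitioner keyed on the full encoded value is free to send them to different tasks, whereas a key derived from the bucket alone cannot.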

amoghmargoor, Jul 22 '20 21:07