HoloClean-Legacy-deprecated
Error Detection Optimization
The current error detection does not analyze the denial constraints (DCs) and computes the same tables for all of them. We can divide the DCs into two types, symmetric and non-symmetric, and we can also parallelize the queries. (The computation in error detection is O(N^2), where N is the number of tuples.)
Benefits: For symmetric DCs, the computation is reduced by half. Parallelizing the queries can increase speed even further.
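To illustrate where the factor of two comes from, here is a minimal sketch in plain Python. It is not the existing HoloClean code: `find_violations`, `count_checks`, and the example predicate are hypothetical names chosen for illustration. When a DC is symmetric, checking the pair (t1, t2) gives the same answer as checking (t2, t1), so only unordered pairs need to be examined.

```python
from itertools import combinations, permutations


def find_violations(rows, predicate, symmetric):
    """Return index pairs (i, j) for which predicate(rows[i], rows[j]) holds."""
    if symmetric:
        # predicate(a, b) == predicate(b, a): check each unordered pair once.
        pairs = combinations(range(len(rows)), 2)
    else:
        # Order matters: both (i, j) and (j, i) have to be checked.
        pairs = permutations(range(len(rows)), 2)
    return [(i, j) for i, j in pairs if predicate(rows[i], rows[j])]


def count_checks(n, symmetric):
    """Number of pairwise predicate evaluations for n tuples."""
    return n * (n - 1) // 2 if symmetric else n * (n - 1)


def same_zip_diff_city(t1, t2):
    # Hypothetical symmetric DC: same zip code but different city.
    return t1["zip"] == t2["zip"] and t1["city"] != t2["city"]


rows = [
    {"zip": "60601", "city": "Chicago"},
    {"zip": "60601", "city": "Chicgo"},
    {"zip": "10001", "city": "New York"},
]
violations = find_violations(rows, same_zip_diff_city, symmetric=True)
```

With `symmetric=True` each violating pair is reported once; the reversed pair is implied and can be added afterwards if downstream code expects both orders.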
Costs: Two methods are needed: 1- determining whether a DC is symmetric (which we already have half implemented); 2- rewriting the query function for symmetric DCs. We also need one method to feed these queries into our parallelism framework.
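The three pieces listed under Costs could look roughly like the sketch below. Everything in it is an assumption made for illustration: a DC is represented as a list of `(left_attr, op, right_attr)` triples meaning `t1.left_attr op t2.right_attr`, the table is assumed to have an `idx` column, and `ThreadPoolExecutor` stands in for the project's own parallelism framework. The symmetry check is purely syntactic, so it may miss DCs that are symmetric only semantically.

```python
from concurrent.futures import ThreadPoolExecutor

# Operator obtained when the two sides of a comparison are exchanged.
FLIP = {"=": "=", "!=": "!=", "<": ">", ">": "<", "<=": ">=", ">=": "<="}


def swap_tuples(dc):
    """Rewrite the DC with the roles of t1 and t2 exchanged."""
    return frozenset((right, FLIP[op], left) for left, op, right in dc)


def is_symmetric(dc):
    """A DC is treated as symmetric if swapping t1 and t2 leaves it unchanged."""
    return swap_tuples(dc) == frozenset(dc)


def build_query(dc, table="data"):
    """Build a self-join query; symmetric DCs only scan pairs with t1.idx < t2.idx."""
    conds = [f"t1.{l} {'<>' if op == '!=' else op} t2.{r}" for l, op, r in dc]
    order = "t1.idx < t2.idx" if is_symmetric(dc) else "t1.idx <> t2.idx"
    where = " AND ".join([order] + conds)
    return f"SELECT t1.idx, t2.idx FROM {table} t1, {table} t2 WHERE {where}"


def run_all(dcs, run_query, max_workers=4):
    """Run one query per DC concurrently; results keep the order of dcs."""
    queries = [build_query(dc) for dc in dcs]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_query, queries))


symmetric_dc = [("zip", "=", "zip"), ("city", "!=", "city")]
ordered_dc = [("salary", ">", "salary"), ("tax", "<", "tax")]
```

Here `run_query` is whatever callable submits a query string to the backend, so the same `run_all` wrapper would work whether the queries go to a SQL engine or a Spark session.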