hyrise icon indicating copy to clipboard operation
hyrise copied to clipboard

[DYOD] Data-induced Predicates

Open T4rikA opened this issue 2 years ago • 0 comments

Data Induced Predicates

This PR introduces a new rule to the optimization of joins: The Data Induced Dependency Rule (DIP). It is inspired by this paper and aims at reducing the input cardinalities for joins. It does this by adding a predicate to one side of the join (the reduced side) in the lqp based on the range of values of the join-attribute available on the other side of the join (the reducer side).


This PR adds the new rule and tests-. We also included a commit history in which we attempted to build a heuristic to evaluate the already implemented Semi Join Reductions (SJR) head to head with DIPs. The current rule allows for multi predicate DIP insertion with the restriction of only supporting equal predicates from which DIP is deduced.


In our research we found out that this multi-predicate rule increases the default master speed, when the SJR rule is not used. For this, see the following comparison of the master without the SJR compared to the master without the SJR rule but our DIP rule being added instead (The following results are all done with a TPC-H scalefactor 10):

Results
+Configuration Overview----+---------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------+
| Parameter                | benchmark_results/week07TT/output_master_without_sjr_rule_9b57a59d1c888e90f44745591aa84ac835dfeba0.json | benchmark_results/week07TT/output_multi_predicate_dip_without_sjr_rule_d3fd38621b2f68b49bc5b43a1fb74f0e7bfd2175.json |
+--------------------------+---------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------+
|  GIT-HASH                | 9b57a59d1c888e90f44745591aa84ac835dfeba0-dirty                                                          | d3fd38621b2f68b49bc5b43a1fb74f0e7bfd2175-dirty                                                                       |
|  benchmark_mode          | Ordered                                                                                                 | Ordered                                                                                                              |
|  build_type              | release                                                                                                 | release                                                                                                              |
|  chunk_indexes           | False                                                                                                   | False                                                                                                                |
|  chunk_size              | 65535                                                                                                   | 65535                                                                                                                |
|  clients                 | 1                                                                                                       | 1                                                                                                                    |
|  clustering              | None                                                                                                    | None                                                                                                                 |
|  compiler                | clang 14.0.0                                                                                            | clang 14.0.0                                                                                                         |
|  cores                   | 0                                                                                                       | 0                                                                                                                    |
|  data_preparation_cores  | 0                                                                                                       | 0                                                                                                                    |
|  date                    | 2023-07-30 22:22:45                                                                                     | 2023-07-31 10:09:07                                                                                                  |
|  encoding                | {'default': {'encoding': 'Dictionary'}}                                                                 | {'default': {'encoding': 'Dictionary'}}                                                                              |
|  max_duration            | 60000000000                                                                                             | 60000000000                                                                                                          |
|  max_runs                | -1                                                                                                      | -1                                                                                                                   |
|  scale_factor            | 10.0                                                                                                    | 10.0                                                                                                                 |
|  time_unit               | ns                                                                                                      | ns                                                                                                                   |
|  use_prepared_statements | False                                                                                                   | False                                                                                                                |
|  using_scheduler         | False                                                                                                   | False                                                                                                                |
|  verify                  | False                                                                                                   | False                                                                                                                |
|  warmup_duration         | 0                                                                                                       | 0                                                                                                                    |
+--------------------------+---------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------+

+----------++----------+----------+--------++----------+----------+--------+---------+
| Item     || Latency (ms/iter)   | Change || Throughput (iter/s) | Change | p-value |
|          ||      old |      new |        ||      old |      new |        |         |
+----------++----------+----------+--------++----------+----------+--------+---------+
| TPC-H 01 ||  5014.47 |  5006.48 |   -0%  ||     0.20 |     0.20 |   +0%  |  0.9307 |
| TPC-H 02 ||   363.89 |   469.41 |  +29%  ||     2.75 |     2.13 |  -22%  |  0.0000 |
| TPC-H 03 ||  1644.99 |  1875.30 |  +14%  ||     0.61 |     0.53 |  -12%  |  0.0000 |
| TPC-H 04 ||  3932.49 |  3991.65 |   +2%  ||     0.25 |     0.25 |   -1%  |  0.0000 |
| TPC-H 05 ||  1774.54 |  2315.95 |  +31%  ||     0.56 |     0.43 |  -23%  |  0.0000 |
| TPC-H 06 ||   143.13 |   144.20 |   +1%  ||     6.99 |     6.93 |   -1%  |  0.0008 |
| TPC-H 07 ||   926.25 |  1331.99 |  +44%  ||     1.08 |     0.75 |  -30%  |  0.0000 |
| TPC-H 08 ||   468.61 |   836.39 |  +78%  ||     2.13 |     1.20 |  -44%  |  0.0000 |
| TPC-H 09 ||  5527.06 |  6224.46 |  +13%  ||     0.18 |     0.16 |  -11%  |  0.0000 |
| TPC-H 10 ||  1506.63 |  1655.11 |  +10%  ||     0.66 |     0.60 |   -9%  |  0.0000 |
| TPC-H 11 ||    72.82 |    98.29 |  +35%  ||    13.73 |    10.17 |  -26%  |  0.0000 |
| TPC-H 12 ||   858.24 |   892.80 |   +4%  ||     1.17 |     1.12 |   -4%  |  0.0000 |
| TPC-H 13 ||  5246.48 |  5194.52 |   -1%  ||     0.19 |     0.19 |   +1%  |  0.1451 |
| TPC-H 14 ||   418.05 |   435.37 |   +4%  ||     2.39 |     2.30 |   -4%  |  0.0000 |
| TPC-H 15 ||   187.52 |   188.09 |   +0%  ||     5.33 |     5.32 |   -0%  |  0.0062 |
| TPC-H 16 ||   699.01 |   715.98 |   +2%  ||     1.43 |     1.40 |   -2%  |  0.0000 |
| TPC-H 17 ||  2932.26 |  3529.49 |  +20%  ||     0.34 |     0.28 |  -17%  |  0.0000 |
| TPC-H 18 ||  1435.58 |  1557.29 |   +8%  ||     0.70 |     0.64 |   -8%  |  0.0005 |
| TPC-H 19 ||   190.20 |   281.00 |  +48%  ||     5.26 |     3.56 |  -32%  |  0.0000 |
| TPC-H 20 ||  2693.43 |  3259.29 |  +21%  ||     0.37 |     0.31 |  -17%  |  0.0000 |
| TPC-H 21 || 26915.48 | 11675.26 |  -57%  ||     0.04 |     0.09 | +131%  |       ˅ |
| TPC-H 22 ||   412.54 |   409.04 |   -1%  ||     2.42 |     2.44 |   +1%  |  0.0000 |
+----------++----------+----------+--------++----------+----------+--------+---------+
| Sum      || 63363.68 | 52087.35 |  -18%  ||          |          |        |         |
| Geomean  ||          |          |        ||          |          |  -10%  |         |
+----------++----------+----------+--------++----------+----------+--------+---------+
|          || ˅ Insufficient number of runs for p-value calculation                  |
+----------++----------+----------+--------++----------+----------+--------+---------+

As you can see, the DIP rule is beneficial for the performance although not all queries benefit from it. If we compare the DIP rule to the master with SJR, we see, that it can't compete with the optimized SJR yet:

Results
+Configuration Overview----+------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------+
| Parameter                | benchmark_results/week07TT/output_master_with_sjr_rule_6f922c7bd8731a84ffe227bddcd069f00268bc82.json | benchmark_results/week07TT/output_multi_predicate_dip_without_sjr_rule_d3fd38621b2f68b49bc5b43a1fb74f0e7bfd2175.json |
+--------------------------+------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------+
|  GIT-HASH                | 6f922c7bd8731a84ffe227bddcd069f00268bc82-dirty                                                       | d3fd38621b2f68b49bc5b43a1fb74f0e7bfd2175-dirty                                                                       |
|  benchmark_mode          | Ordered                                                                                              | Ordered                                                                                                              |
|  build_type              | release                                                                                              | release                                                                                                              |
|  chunk_indexes           | False                                                                                                | False                                                                                                                |
|  chunk_size              | 65535                                                                                                | 65535                                                                                                                |
|  clients                 | 1                                                                                                    | 1                                                                                                                    |
|  clustering              | None                                                                                                 | None                                                                                                                 |
|  compiler                | clang 14.0.0                                                                                         | clang 14.0.0                                                                                                         |
|  cores                   | 0                                                                                                    | 0                                                                                                                    |
|  data_preparation_cores  | 0                                                                                                    | 0                                                                                                                    |
|  date                    | 2023-07-30 22:48:52                                                                                  | 2023-07-31 10:09:07                                                                                                  |
|  encoding                | {'default': {'encoding': 'Dictionary'}}                                                              | {'default': {'encoding': 'Dictionary'}}                                                                              |
|  max_duration            | 60000000000                                                                                          | 60000000000                                                                                                          |
|  max_runs                | -1                                                                                                   | -1                                                                                                                   |
|  scale_factor            | 10.0                                                                                                 | 10.0                                                                                                                 |
|  time_unit               | ns                                                                                                   | ns                                                                                                                   |
|  use_prepared_statements | False                                                                                                | False                                                                                                                |
|  using_scheduler         | False                                                                                                | False                                                                                                                |
|  verify                  | False                                                                                                | False                                                                                                                |
|  warmup_duration         | 0                                                                                                    | 0                                                                                                                    |
+--------------------------+------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------+

+----------++----------+----------+---------++----------+----------+--------+---------+
| Item     || Latency (ms/iter)   |  Change || Throughput (iter/s) | Change | p-value |
|          ||      old |      new |         ||      old |      new |        |         |
+----------++----------+----------+---------++----------+----------+--------+---------+
| TPC-H 01 ||  4986.28 |  5006.48 |    +0%  ||     0.20 |     0.20 |   -0%  |  0.8437 |
| TPC-H 02 ||    39.17 |   469.41 | +1099%  ||    25.53 |     2.13 |  -92%  |  0.0000 |
| TPC-H 03 ||  2171.69 |  1875.30 |   -14%  ||     0.46 |     0.53 |  +16%  |  0.0000 |
| TPC-H 04 ||  1281.50 |  3991.65 |  +211%  ||     0.78 |     0.25 |  -68%  |  0.0000 |
| TPC-H 05 ||  2760.99 |  2315.95 |   -16%  ||     0.36 |     0.43 |  +19%  |  0.0000 |
| TPC-H 06 ||   143.29 |   144.20 |    +1%  ||     6.98 |     6.93 |   -1%  |  0.0089 |
| TPC-H 07 ||   859.95 |  1331.99 |   +55%  ||     1.16 |     0.75 |  -35%  |  0.0000 |
| TPC-H 08 ||   673.33 |   836.39 |   +24%  ||     1.49 |     1.20 |  -19%  |  0.0000 |
| TPC-H 09 ||  5068.96 |  6224.46 |   +23%  ||     0.20 |     0.16 |  -19%  |  0.0000 |
| TPC-H 10 ||  2949.77 |  1655.11 |   -44%  ||     0.34 |     0.60 |  +78%  |  0.0000 |
| TPC-H 11 ||    70.92 |    98.29 |   +39%  ||    14.10 |    10.17 |  -28%  |  0.0000 |
| TPC-H 12 ||   927.65 |   892.80 |    -4%  ||     1.08 |     1.12 |   +4%  |  0.0000 |
| TPC-H 13 ||  5149.94 |  5194.52 |    +1%  ||     0.19 |     0.19 |   -1%  |  0.2324 |
| TPC-H 14 ||   419.95 |   435.37 |    +4%  ||     2.38 |     2.30 |   -4%  |  0.0000 |
| TPC-H 15 ||   187.23 |   188.09 |    +0%  ||     5.34 |     5.32 |   -0%  |  0.0010 |
| TPC-H 16 ||   641.90 |   715.98 |   +12%  ||     1.56 |     1.40 |  -10%  |  0.0000 |
| TPC-H 17 ||   203.64 |  3529.49 | +1633%  ||     4.91 |     0.28 |  -94%  |  0.0000 |
| TPC-H 18 ||  1616.40 |  1557.29 |    -4%  ||     0.62 |     0.64 |   +4%  |  0.2450 |
| TPC-H 19 ||   249.81 |   281.00 |   +12%  ||     4.00 |     3.56 |  -11%  |  0.0000 |
| TPC-H 20 ||   376.40 |  3259.29 |  +766%  ||     2.66 |     0.31 |  -88%  |  0.0000 |
| TPC-H 21 ||  4347.86 | 11675.26 |  +169%  ||     0.23 |     0.09 |  -63%  |       ˅ |
| TPC-H 22 ||   409.04 |   409.04 |    -0%  ||     2.44 |     2.44 |   +0%  |  0.9984 |
+----------++----------+----------+---------++----------+----------+--------+---------+
| Sum      || 35535.66 | 52087.35 |   +47%  ||          |          |        |         |
| Geomean  ||          |          |         ||          |          |  -37%  |         |
+----------++----------+----------+---------++----------+----------+--------+---------+
|          || ˅ Insufficient number of runs for p-value calculation                   |
+----------++----------+----------+---------++----------+----------+--------+---------+

In the commit history you can also find efforts to build a heuristic to add both DIPs and SJRs in one rule. Unfortunately it did not turn out to improve the performance of our solution. See for yourself, in our comparisson of the master including the SJR versus our heuristic without the SJR:

Results
+Configuration Overview----+------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
| Parameter                | benchmark_results/week07TT/output_master_with_sjr_rule_6f922c7bd8731a84ffe227bddcd069f00268bc82.json | benchmark_results/week07TT/output_dip_rule_without_sjr_7254c66471e735e452950be3fcdc9dd43c3fa029.json |
+--------------------------+------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+
|  GIT-HASH                | 6f922c7bd8731a84ffe227bddcd069f00268bc82-dirty                                                       | 7254c66471e735e452950be3fcdc9dd43c3fa029-dirty                                                       |
|  benchmark_mode          | Ordered                                                                                              | Ordered                                                                                              |
|  build_type              | release                                                                                              | release                                                                                              |
|  chunk_indexes           | False                                                                                                | False                                                                                                |
|  chunk_size              | 65535                                                                                                | 65535                                                                                                |
|  clients                 | 1                                                                                                    | 1                                                                                                    |
|  clustering              | None                                                                                                 | None                                                                                                 |
|  compiler                | clang 14.0.0                                                                                         | clang 14.0.0                                                                                         |
|  cores                   | 0                                                                                                    | 0                                                                                                    |
|  data_preparation_cores  | 0                                                                                                    | 0                                                                                                    |
|  date                    | 2023-07-30 22:48:52                                                                                  | 2023-07-30 23:14:47                                                                                  |
|  encoding                | {'default': {'encoding': 'Dictionary'}}                                                              | {'default': {'encoding': 'Dictionary'}}                                                              |
|  max_duration            | 60000000000                                                                                          | 60000000000                                                                                          |
|  max_runs                | -1                                                                                                   | -1                                                                                                   |
|  scale_factor            | 10.0                                                                                                 | 10.0                                                                                                 |
|  time_unit               | ns                                                                                                   | ns                                                                                                   |
|  use_prepared_statements | False                                                                                                | False                                                                                                |
|  using_scheduler         | False                                                                                                | False                                                                                                |
|  verify                  | False                                                                                                | False                                                                                                |
|  warmup_duration         | 0                                                                                                    | 0                                                                                                    |
+--------------------------+------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------+

+----------++----------+----------+---------++----------+----------+--------+---------+
| Item     || Latency (ms/iter)   |  Change || Throughput (iter/s) | Change | p-value |
|          ||      old |      new |         ||      old |      new |        |         |
+----------++----------+----------+---------++----------+----------+--------+---------+
| TPC-H 01 ||  4986.28 |  5014.15 |    +1%  ||     0.20 |     0.20 |   -1%  |  0.7839 |
| TPC-H 02 ||    39.17 |   457.95 | +1069%  ||    25.53 |     2.18 |  -91%  |  0.0000 |
| TPC-H 03 ||  2171.69 |  1858.86 |   -14%  ||     0.46 |     0.54 |  +17%  |  0.0000 |
| TPC-H 04 ||  1281.50 |  4011.32 |  +213%  ||     0.78 |     0.25 |  -68%  |  0.0000 |
| TPC-H 05 ||  2760.99 |  1899.43 |   -31%  ||     0.36 |     0.53 |  +45%  |  0.0000 |
| TPC-H 06 ||   143.29 |   143.42 |    +0%  ||     6.98 |     6.97 |   -0%  |  0.7099 |
| TPC-H 07 ||   859.95 |   935.47 |    +9%  ||     1.16 |     1.07 |   -8%  |  0.0000 |
| TPC-H 08 ||   673.33 |   837.96 |   +24%  ||     1.49 |     1.19 |  -20%  |  0.0000 |
| TPC-H 09 ||  5068.96 |  5833.68 |   +15%  ||     0.20 |     0.17 |  -13%  |  0.0000 |
| TPC-H 10 ||  2949.77 |  1638.95 |   -44%  ||     0.34 |     0.61 |  +80%  |  0.0000 |
| TPC-H 11 ||    70.92 |    82.15 |   +16%  ||    14.10 |    12.17 |  -14%  |  0.0000 |
| TPC-H 12 ||   927.65 |   889.56 |    -4%  ||     1.08 |     1.12 |   +4%  |  0.0000 |
| TPC-H 13 ||  5149.94 |  5200.85 |    +1%  ||     0.19 |     0.19 |   -1%  |  0.2178 |
| TPC-H 14 ||   419.95 |   446.31 |    +6%  ||     2.38 |     2.24 |   -6%  |  0.0000 |
| TPC-H 15 ||   187.23 |   189.93 |    +1%  ||     5.34 |     5.26 |   -1%  |  0.0000 |
| TPC-H 16 ||   641.90 |   726.86 |   +13%  ||     1.56 |     1.38 |  -12%  |  0.0000 |
| TPC-H 17 ||   203.64 |  3555.70 | +1646%  ||     4.91 |     0.28 |  -94%  |  0.0000 |
| TPC-H 18 ||  1616.40 |  1526.28 |    -6%  ||     0.62 |     0.66 |   +6%  |  0.0552 |
| TPC-H 19 ||   249.81 |   284.70 |   +14%  ||     4.00 |     3.51 |  -12%  |  0.0000 |
| TPC-H 20 ||   376.40 |   398.47 |    +6%  ||     2.66 |     2.51 |   -6%  |  0.0000 |
| TPC-H 21 ||  4347.86 | 11666.08 |  +168%  ||     0.23 |     0.09 |  -63%  |       ˅ |
| TPC-H 22 ||   409.04 |   407.78 |    -0%  ||     2.44 |     2.45 |   +0%  |  0.0243 |
+----------++----------+----------+---------++----------+----------+--------+---------+
| Sum      || 35535.66 | 48005.86 |   +35%  ||          |          |        |         |
| Geomean  ||          |          |         ||          |          |  -28%  |         |
+----------++----------+----------+---------++----------+----------+--------+---------+
|          || ˅ Insufficient number of runs for p-value calculation                   |
+----------++----------+----------+---------++----------+----------+--------+---------+


It does increase the performance compared to only using the DIP but still is no match for SJR.


Pointers for future work on DIPs

We found out that DIPs work particularly well whenever there is a prior selection on an attribute correlated to the join attribute. An example for this can be found in Query 21 of the TPC-H Benchmark, where a prior selection on "n_nation = 'Japan'" allows for an effective DIP on the join attribute "n_nationkey". Currently the cardinality estimation of DIPs is based on the SJR and does not take prior selections and data correlations into account. Besides only considering the use of DIPs if there is a prior selection on the reducer side, for which we already provide code as a comment, using range sets as introduced in the source paper could make DIPs a viable optimization rule. Range sets provide a closer estimation of the ranges of each attribute present in every partition of the table. This could not only put a stricter bound on the selection implemented by the DIP but also allow for a more exact and DIP-specific cardinality estimation. Additionaly it could be further evaluated if the between predicates can be pushed closer to the table scans to further optimize the cardinalities.

T4rikA avatar Jul 25 '23 07:07 T4rikA