risinglight icon indicating copy to clipboard operation
risinglight copied to clipboard

executor: better explain for physical plans

Open skyzh opened this issue 2 years ago • 6 comments

Currently, RisingLight's explain interface is not really "explainable". There are some executors that...

  • Use Rust's default std::fmt::Debug when explaining
  • Doesn't print out all information
  • Output too long line
  • ...

We should fix those plans and make the explain UI more pretty.

A (previous) bad example is TPC-H Q1:

  • PhysicalProjection's output is too long. Maybe we should insert new lines for exprs?
  • PhysicalHashAgg doesn't print out its aggregation functions.
> explain select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    lineitem
where
    l_shipdate <= date '1998-12-01' - interval '71' day
group by
    l_returnflag,
    l_linestatus
order by
    l_returnflag,
    l_linestatus;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PhysicalOrder: [InputRef #0 (asc), InputRef #1 (asc)]                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|   PhysicalProjection: exprs [InputRef #0, InputRef #1, InputRef #2 (alias to sum_qty), InputRef #3 (alias to sum_base_price), InputRef #4 (alias to sum_disc_price), InputRef #5 (alias to sum_charge), Divide(InputRef #6, InputRef #7 as Decimal(Some(15), Some(2))) (alias to avg_qty), Divide(InputRef #8, InputRef #9 as Decimal(Some(15), Some(2))) (alias to avg_price), Divide(InputRef #10, InputRef #11 as Decimal(Some(15), Some(2))) (alias to avg_disc), InputRef #12 (alias to count_order)] |
|     PhysicalHashAgg: 11 agg calls                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|       PhysicalTableScan: table #7, columns [10, 8, 9, 4, 5, 6, 7, 0], with_row_handler: false, is_sorted: false, expr: LtEq(InputRef #0, Date(Date(10490)) (const))                                                                                                                                                                                                                                                                                                                                        |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

This is relatively an easy task to do. If you have interest in improving the explain output, feel free to send PRs!

What we need to do is simply change the Display implementation:

impl fmt::Display for PhysicalFilter {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        writeln!(f, "PhysicalFilter: expr {:?}", self.logical().expr())
    }
}

... to match the format of other queries. Like:

PhysicalOrder: [InputRef #0 (asc), InputRef #1 (asc)]
  PhysicalProjection:
      InputRef #0
      InputRef #1
      InputRef #2 (alias to sum_qty)
      InputRef #3 (alias to sum_base_price)
      InputRef #4 (alias to sum_disc_price)
      InputRef #5 (alias to sum_charge)
      (InputRef #6 / InputRef #7) (alias to avg_qty)
      (InputRef #8 / InputRef #9) (alias to avg_price)
      (InputRef #10 / InputRef #11) (alias to avg_disc)
      InputRef #12 (alias to count_order)
    PhysicalHashAgg:
        sum(InputRef #3) -> NUMERIC(15,2)
        sum(InputRef #4) -> NUMERIC(15,2)
        sum((InputRef #4 * (1 - InputRef #5))) -> NUMERIC(15,2) (null)
        sum(((InputRef #4 * (1 - InputRef #5)) * (1 + InputRef #6))) -> NUMERIC(15,2) (null)
        sum(InputRef #3) -> NUMERIC(15,2)
        count(InputRef #3) -> INT
        sum(InputRef #4) -> NUMERIC(15,2)
        count(InputRef #4) -> INT
        sum(InputRef #5) -> NUMERIC(15,2)
        count(InputRef #5) -> INT
        count(InputRef #0) -> INT
      PhysicalTableScan:
          table #7,
          columns [10, 8, 9, 4, 5, 6, 7],
          with_row_handler: true,
          is_sorted: false,
          expr: LtEq(InputRef #0, Date(Date(10490)) (const))
  • [x] HashAgg, Projection and TableScan https://github.com/risinglightdb/risinglight/pull/469
  • [ ] Drop
  • [ ] Filter
  • [ ] Order
  • [x] HashJoin
  • [ ] NestedLoopJoin
  • [ ] TopN
  • [ ] Values

skyzh avatar Feb 05 '22 07:02 skyzh

@nanderstabel will take this issue. Thanks for your contribution!

skyzh avatar Feb 14 '22 10:02 skyzh

Opened a pr here: #469

nanderstabel avatar Feb 14 '22 23:02 nanderstabel

Currently, the NLJ will output:

> create table t1(v1 int not null);
created
in 22ms 578us 583ns
> create table t2(v1 int not null);
created
in 22ms 385us 875ns
> explain select * from t1, t2 where true;
PhysicalProjection:
    InputRef #0
    InputRef #1
  PhysicalNestedLoopJoin: op Inner, predicate: Bool(true) (const)
    PhysicalTableScan:
        table #8,
        columns [0],
        with_row_handler: false,
        is_sorted: false,
        expr: None
    PhysicalTableScan:
        table #9,
        columns [0],
        with_row_handler: false,
        is_sorted: false,
        expr: None

while order by will output:

> explain select * from t1 order by v1;
PhysicalOrder: [InputRef #0 (asc)]
  PhysicalProjection:
      InputRef #0
    PhysicalTableScan:
        table #8,
        columns [0],
        with_row_handler: false,
        is_sorted: false,
        expr: None

We should change OrderBy and NLJ to the multi-line format.

cc @WindowsXp-Beta will take over this, thanks!

skyzh avatar Feb 23 '22 05:02 skyzh

Hi @skyzh , I am relatively new to Rust and RisingLight, but I am very interested in contributing to the project!

Currently, I think this issue has been resolved for

  • OrderBy
  • NLJ
  • HashAgg, Projection, TableScan
  • HashJoin

How about the rest like

  • Drop
  • Filter
  • TopN
  • Values

Can I kindly check if we have some target (improved) output for their displays?

Many thanks!

litone01 avatar Jul 30 '22 10:07 litone01

I think what you've mentioned can be improved to show as much information as possible, so you can have a try if you want!

skyzh avatar Jul 30 '22 11:07 skyzh

I think what you've mentioned can be improved to show as much information as possible, so you can have a try if you want!

Sure! I will have a closer look at the related methods for Drop, Filter, TopN, Values and give it a try:) Do allow me to take a few days to understand the context and codebase a bit more in-depth.

Thanks again!

litone01 avatar Jul 30 '22 11:07 litone01

Hi, @skyzh @wangrunji0408 let me try drop. ps: I may need a little time, I'm a beginner in rust

caicancai avatar Nov 09 '23 14:11 caicancai

Hi, @skyzh let me try drop

Apologies that I didn't have the bandwidth to complete this. Pls feel free to assign to another person. Thanks.

litone01 avatar Nov 09 '23 14:11 litone01

@caicancai Just go ahead!

@litone01 Never mind. You don't need to apologize. This is normal in open source and we are human beings.

xxchan avatar Nov 09 '23 14:11 xxchan

@caicancai Just go ahead!

@litone01 Never mind. You don't need to apologize. This is normal in open source and we are human beings.

> explain select * from t1, t2 where true;
+----------------------------------------------------------------------+
| Join { type: inner, cost: 0, rows: 0 }                               |
| ├── Scan { table: t1, list: [ v1 ], filter: null, cost: 0, rows: 0 } |
| └── Scan { table: t2, list: [ v1 ], filter: null, cost: 0, rows: 0 } |
+----------------------------------------------------------------------+

Hello, I found that this part of the overall code seems to be refactured, this task seems to be no longer applicable to the current code structure, the presentation of the entire physical plan is not consistent with the issue description

caicancai avatar Nov 13 '23 07:11 caicancai

Indeed. Seems improved a lot by #771. Sorry!

xxchan avatar Nov 13 '23 08:11 xxchan