datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Wrong metric when using EXPLAIN ANALYZE

Open Ted-Jiang opened this issue 3 years ago • 1 comments

Describe the bug submit a query


❯ select sum("30"), "29" from test1 group by "29";

198683 rows in set. Query took 0.436 seconds.

then use explain analyze

explain analyze  select sum("30"), "29" from test1 group by "29";
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | CoalescePartitionsExec, metrics=[output_rows=198683, elapsed_compute=47.871µs, spill_count=0, spilled_bytes=0, mem_used=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                   |   ProjectionExec: expr=[SUM(test1.30)@1 as SUM(test1.30), 29@0 as 29], metrics=[output_rows=198683, elapsed_compute=103.299µs, spill_count=0, spilled_bytes=0, mem_used=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                   |     AggregateExec: mode=FinalPartitioned, gby=[29@0 as 29], aggr=[SUM(test1.30)], metrics=[output_rows=198683, elapsed_compute=514.502772ms, spill_count=0, spilled_bytes=0, mem_used=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                   |       CoalesceBatchesExec: target_batch_size=4096, metrics=[output_rows=972749, elapsed_compute=91.069241ms, spill_count=0, spilled_bytes=0, mem_used=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|                   |         RepartitionExec: partitioning=Hash([Column { name: "29", index: 0 }], 100), metrics=[fetch_time{inputPartition=0}=5.544007548s, repart_time{inputPartition=0}=75.322775ms, send_time{inputPartition=0}=14.473058ms]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|                   |           AggregateExec: mode=Partial, gby=[29@1 as 29], aggr=[SUM(test1.30)], metrics=[output_rows=972749, elapsed_compute=1.915277419s, spill_count=0, spilled_bytes=0, mem_used=0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
|                   |             RepartitionExec: partitioning=RoundRobinBatch(100), metrics=[send_time{inputPartition=0}=243.407µs, fetch_time{inputPartition=0}=43.401471ms, repart_time{inputPartition=0}=1ns]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                   |               ParquetExec: limit=None, projection=[30, 29], metrics=[output_rows=1000202, elapsed_compute=13.253088ms, spill_count=0, spilled_bytes=0, mem_used=0, num_predicate_creation_errors=0, row_groups_pruned{filename=kylin/rno_native_dev/kylin_metadata_test/ssb/parquet/kylin_udaf_cube/19900101000000_20000101000000_ACG/17179869183/part-00005-84c1f179-eda5-4947-86ca-3d2497b485c4-c000.snappy.parquet}=0, bytes_scanned{filename=kylin/rno_native_dev/kylin_metadata_test/ssb/parquet/kylin_udaf_cube/19900101000000_20000101000000_ACG/17179869183/part-00005-84c1f179-eda5-4947-86ca-3d2497b485c4-c000.snappy.parquet}=5959033, predicate_evaluation_errors{filename=kylin/rno_native_dev/kylin_metadata_test/ssb/parquet/kylin_udaf_cube/19900101000000_20000101000000_ACG/17179869183/part-00005-84c1f179-eda5-4947-86ca-3d2497b485c4-c000.snappy.parquet}=0] |
|                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set. Query took 0.093 seconds.

the first elapsed_compute=1.915277419s in AGG is even longer than the query cose 😂

To Reproduce Steps to reproduce the behavior:

Expected behavior A clear and concise description of what you expected to happen.

Additional context Add any other context about the problem here.

Ted-Jiang avatar Aug 01 '22 13:08 Ted-Jiang

I think elapsed_compute is meant to capture how much cpu time was taken -- so if the query ran on 16 cores for 1 second of wall clock time, then I would expect that elapsed_compute would be reported as 16 sec

That being said, it is hard to imagine how 0.093 seconds query time could use 1.9 sec of compute 🤔

alamb avatar Aug 03 '22 13:08 alamb