perf: fast path for COUNT(*) queries

Open c-peters opened this issue 1 year ago • 2 comments

This implements a specialized Count function in our logical plan that avoids materializing the entire DataFrame when performing COUNT(*) on a data source.
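To illustrate the idea for a CSV source, here is a minimal std-only sketch, not the code from this PR. It counts rows by scanning buffered bytes for newlines, so no columns are ever parsed or allocated. The names `count_rows` and `has_header` are hypothetical, and the sketch assumes no quoted fields contain embedded newlines.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Count data rows in CSV-like input without building any columns.
/// Hypothetical helper for illustration only.
/// Assumption: no quoted field contains an embedded newline.
fn count_rows<R: Read>(reader: R, has_header: bool) -> io::Result<usize> {
    let mut reader = BufReader::new(reader);
    let mut lines = 0usize;
    // Treat empty input as if it ended on a line boundary.
    let mut last_byte = b'\n';
    loop {
        let buf = reader.fill_buf()?;
        if buf.is_empty() {
            break; // end of input
        }
        // Only newline bytes are inspected; field contents are never parsed.
        lines += buf.iter().filter(|&&b| b == b'\n').count();
        last_byte = buf[buf.len() - 1];
        let n = buf.len();
        reader.consume(n);
    }
    // A final line without a trailing newline still counts as a row.
    if last_byte != b'\n' {
        lines += 1;
    }
    // Drop the header line from the count, if there is one.
    Ok(if has_header { lines.saturating_sub(1) } else { lines })
}
```

Calling `count_rows("a,b\n1,2\n3,4\n5,6".as_bytes(), true)` yields 3. Memory use is bounded by the reader's buffer rather than by the size of the file, which is the property the fast path is after.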

fix #13521

c-peters avatar Feb 18 '24 16:02 c-peters

Codecov Report

Attention: Patch coverage is 95.74468%, with 12 lines in your changes missing coverage. Please review.

Project coverage is 80.93%. Comparing base (00e84fc) to head (1f59ce6).

| Files | Patch % | Lines |
|---|---|---|
| crates/polars-arrow/src/io/ipc/read/file.rs | 94.44% | 4 Missing :warning: |
| ...es/polars-plan/src/logical_plan/functions/count.rs | 95.52% | 3 Missing :warning: |
| crates/polars-io/src/csv/parser.rs | 96.07% | 2 Missing :warning: |
| ...lars-plan/src/logical_plan/optimizer/count_star.rs | 97.01% | 2 Missing :warning: |
| ...ates/polars-plan/src/logical_plan/functions/mod.rs | 85.71% | 1 Missing :warning: |

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14574      +/-   ##
==========================================
+ Coverage   80.81%   80.93%   +0.11%     
==========================================
  Files        1326     1328       +2     
  Lines      173208   173446     +238     
  Branches     2455     2455              
==========================================
+ Hits       139976   140373     +397     
+ Misses      32759    32601     -158     
+ Partials      473      472       -1     

codecov[bot] avatar Feb 18 '24 16:02 codecov[bot]

@ritchie46, could you take a final look at the parallel CSV part, in case that is what you meant?

c-peters avatar Feb 21 '24 12:02 c-peters

Finally optimized the most run query in the world. :sunglasses:

ritchie46 avatar Feb 24 '24 23:02 ritchie46