polars
feat(rust, python): Add `top_k` and `bottom_k` in the `GroupBy` namespace
Closes #10054
Return the top or bottom `k` rows of each group, sorted by the given order.
Example
Rust
Only implemented for `LazyGroupBy`.
// Assumes `use polars::prelude::*;` is in scope.
let df = df![
    "a" => &[1, 2, 2, 3, 4, 5],
    "b" => &[5.5, 0.5, 4.0, 10.0, 13.0, 17.0],
    "c" => &[true, true, true, false, false, true],
    "d" => &["Apple", "Orange", "Apple", "Apple", "Banana", "Banana"],
].unwrap();
println!(
    "{:?}",
    df.lazy().group_by_stable(&[col("d")])
        .bottom_k(2, &[col("b")], [true])
        .collect()
        .unwrap()
);
Output:
shape: (5, 4)
┌────────┬─────┬──────┬───────┐
│ d      ┆ a   ┆ b    ┆ c     │
│ ---    ┆ --- ┆ ---  ┆ ---   │
│ str    ┆ i32 ┆ f64  ┆ bool  │
╞════════╪═════╪══════╪═══════╡
│ Apple  ┆ 3   ┆ 10.0 ┆ false │
│ Apple  ┆ 1   ┆ 5.5  ┆ true  │
│ Orange ┆ 2   ┆ 0.5  ┆ true  │
│ Banana ┆ 5   ┆ 17.0 ┆ true  │
│ Banana ┆ 4   ┆ 13.0 ┆ false │
└────────┴─────┴──────┴───────┘
Python
Implemented for both `LazyGroupBy` and `GroupBy`.
import polars as pl

df = pl.DataFrame(
    {
        "a": [1, 2, 2, 3, 4, 5],
        "b": [5.5, 0.5, 4, 10, 13, 17],
        "c": [True, True, True, False, False, True],
        "d": ["Apple", "Orange", "Apple", "Apple", "Banana", "Banana"],
    }
)
df.group_by("d", maintain_order=True).bottom_k(2, by="b")
Output:
shape: (5, 4)
┌────────┬─────┬──────┬───────┐
│ d      ┆ a   ┆ b    ┆ c     │
│ ---    ┆ --- ┆ ---  ┆ ---   │
│ str    ┆ i64 ┆ f64  ┆ bool  │
╞════════╪═════╪══════╪═══════╡
│ Apple  ┆ 2   ┆ 4.0  ┆ true  │
│ Apple  ┆ 1   ┆ 5.5  ┆ true  │
│ Orange ┆ 2   ┆ 0.5  ┆ true  │
│ Banana ┆ 4   ┆ 13.0 ┆ false │
│ Banana ┆ 5   ┆ 17.0 ┆ true  │
└────────┴─────┴──────┴───────┘
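For readers who want to check the expected rows by hand, the per-group selection can be sketched in plain Python without Polars. This is only an illustration of the semantics the two examples suggest, not the implementation in this PR: the helper name `bottom_k_per_group` is hypothetical, and the `descending` flag is modeled on the Rust output above (an assumption drawn from that output, where passing `true` keeps the largest values of `b`).

```python
def bottom_k_per_group(rows, key, by, k, descending=False):
    """Hypothetical helper: sort each group by `by`, then keep its first k rows."""
    groups = {}
    for row in rows:
        # dicts preserve insertion order, so groups appear in first-seen order,
        # which mirrors maintain_order=True / group_by_stable
        groups.setdefault(row[key], []).append(row)
    out = []
    for members in groups.values():
        ranked = sorted(members, key=lambda r: r[by], reverse=descending)
        out.extend(ranked[:k])
    return out


rows = [
    {"a": 1, "b": 5.5, "c": True, "d": "Apple"},
    {"a": 2, "b": 0.5, "c": True, "d": "Orange"},
    {"a": 2, "b": 4.0, "c": True, "d": "Apple"},
    {"a": 3, "b": 10.0, "c": False, "d": "Apple"},
    {"a": 4, "b": 13.0, "c": False, "d": "Banana"},
    {"a": 5, "b": 17.0, "c": True, "d": "Banana"},
]

ascending = bottom_k_per_group(rows, key="d", by="b", k=2)
descending = bottom_k_per_group(rows, key="d", by="b", k=2, descending=True)

for r in ascending:
    print(r["d"], r["a"], r["b"], r["c"])
```

With the default ordering the sketch yields the same rows as the Python output above; with `descending=True` it yields the rows shown in the Rust output.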
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 81.33%. Comparing base (2fca551) to head (eab8cea). Report is 1 commit behind head on main.
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15263      +/-   ##
==========================================
+ Coverage   81.31%   81.33%   +0.01%
==========================================
  Files        1359     1359
  Lines      176083   176163      +80
  Branches     2524     2536      +12
==========================================
+ Hits       143188   143280      +92
+ Misses      32411    32399      -12
  Partials      484      484
Sorry for the confusion. See comment: https://github.com/pola-rs/polars/issues/10054#issuecomment-2025127965