Francesco Bruzzesi
Hey Dea, thanks for the input. That's what happens when I open issues in a rush. Let me try to clarify some points and ideas. My understanding is that one...
Commenting to discuss the idea: since Plotly is understandably concerned about performance, maybe we could use the script they shared to assess whether we have a performance drop.
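To make the idea concrete, here is a minimal sketch of what such a check could look like. It is not the script Plotly shared: the helper names (`best_time`, `is_regression`), the 10% tolerance, and the two toy workloads are all assumptions for illustration, using only the standard library's `timeit`.

```python
import timeit
from typing import Callable


def best_time(fn: Callable[[], object], repeat: int = 5, number: int = 10) -> float:
    """Return the best (minimum) per-call time in seconds over `repeat` runs."""
    timings = timeit.repeat(fn, repeat=repeat, number=number)
    return min(timings) / number


def is_regression(baseline_s: float, candidate_s: float, tolerance: float = 0.10) -> bool:
    """Flag a regression when the candidate is more than `tolerance` slower than the baseline."""
    return candidate_s > baseline_s * (1.0 + tolerance)


# Hypothetical workloads standing in for "the same query before and after a change".
def baseline_workload() -> int:
    return sum(range(10_000))


def candidate_workload() -> int:
    # Materialises a list first, so it does strictly more work than the baseline.
    return sum(list(range(10_000)))


if __name__ == "__main__":
    base = best_time(baseline_workload)
    cand = best_time(candidate_workload)
    print(f"baseline={base:.6f}s candidate={cand:.6f}s regression={is_regression(base, cand)}")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; the tolerance would need tuning to whatever variance the real workload shows.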
Thanks for reviving this, @dangotbanned. Last time we tried, we noticed that running TPC-H queries was taking way too long for how we want to iterate. At the same time,...
> Maybe we'd want to consider a tool that's more actively maintained than [tuna](https://github.com/nschloe/tuna)?

I think CodSpeed is a very good option for public projects (it is used by pydantic...
I will leave it here, even though I imagine it is quite challenging: [`skrub`](https://github.com/skrub-data/skrub) would be a very suitable use case.
Couple more:  
> @FBruzzesi I am slightly confused, it seems that a lot of the ci fails don't have anything to do with the changes in this pr? can you confirm this?...
Just for context: the idea of implementing these queries is both for performance benchmarking and feature availability. Meaning that if we are able to run all (most) of them, then...
**Findings from Q21**:

- `DataFrame.join` doesn't have an `on` keyword (I believe by design); I replaced it with `left_on=key, right_on=key`. Annoying but doable 😂
- `Expr` does not support `.len()` (opened...
**Findings from Q20**

- There is a group-by expression that is tagged as "complex", while in reality it is a simple expression multiplied by a constant, namely `.agg((nw.col("l_quantity").sum() * 0.5).alias("sum_quantity"))`....