perf(pandas): `Alias` executes its children twice
On the Pandas backend, Alias executes the expression it is renaming (and the tree under it) twice
See this notebook
@timothydijamco We're going to release 3.1 tomorrow (July 22, 2022) so I don't think a fix for this will make it in! However, definitely for 4.0.0 or perhaps a bugfix 3.1.1 in between.
Given the large changes for 4.0, we'd strongly prefer this in a 3.1.1 patch version if possible.
@gerrymanoim That makes sense! Happy to review PRs for this. Is this something that's high priority for y'all?
thanks! It's somewhat significant for us. I think I have a fix and can open a PR this week.
I believe this just needs to be changed to return data (data being the second argument) but I'll check it against the Ibis tests and add a re-execution test
https://github.com/ibis-project/ibis/blob/00c83e6912263d58cbf04d7904bc0e96ccb03af0/ibis/backends/pandas/execution/generic.py#L926-L930
@timothydijamco Sounds good! I'll review the PR as soon as it's up.
Thanks! We're doing another round of updates/patches, my main point was that we want to get these fixes relating to 3.0->3.1 in before we try to tackle the breaking changes slated for 4.0.
Yep that makes sense. We'll do a 3.2 release before 4.0 lands.
@gerrymanoim We'd like to release this week. Any chance of getting this fix in for 3.2? Otherwise, we'll have to leave it for 4.0.
@timothydijamco if u have a chance to do this
@gerrymanoim is out of town
@jreback Thanks!
yup, sorry my Ibis dev environment has been in a bad state but I might be able to do this blind. I'm out of town as well but I pushed what I have and will take a quick look again later today.