ibis
ibis copied to clipboard
refactor: simplify relational operators
Many logically distinct operations like filter, project, and aggregate are fused together into a single operation.
While this was useful initially for generating clean SQL, it has a number of downsides regarding maintenance and complexity of the library:
- The current design has led to a very tight coupling between the logical representation of an expression tree, and its optimization.
- It becomes difficult to maintain the behavior of these operators in isolation, because one must handle every aspect of a non-join relation when dealing with them in pretty much every context. There's no way to just think about a predicate.
To the end of simplifying the design and maintenance of ibis, I propose two related internal changes.
Proposal:
Separate Selection into distinct operators for each operation:
- Project
- Filter
- OrderBy
Whittle down Aggregation into a class with only metrics and by as instance members, and use where to implement the having API, and Filter and OrderBy to take the place of predicates and sort_keys respectively.
would for sure be +1 here
maybe could even do this w/o breaking the world (eg back compatible)
Yeah, I think this can be done in a backwards compatible way. The generated SQL for certain backends will almost certainly change though.
+1 to this. We internally depend on it currently being projection (isinstance checks) but not a huge deal to update to using the new classes.
@cpcloud is this issue still valid?
@lostmygithubaccount Yep, happening as part of #7580!
Done in the-epic-split!