ibis icon indicating copy to clipboard operation
ibis copied to clipboard

docs: description of "predicates" arg of join method is unclear

Open JonAnCla opened this issue 1 year ago • 4 comments

Please describe the issue

There are two points I'd like to highlight:

  1. In https://ibis-project.org/reference/expression-tables#ibis.expr.types.relations.Table.join, the description of arg "predicate" says to look at the examples, but i) there are lots of examples to read through to check if what you want might be supported, and ii) the name "predicate" implied to me some boolean condition was required. So my suggestion is that the overloading of "predicate" be explained in the description e.g. i) it could be column name(s) to join on ii) it could be a condition on which to join on, iii) it could be "True" meaning cross join (from what I understand of #6696) etc
  2. I found the argument name "predicate" confusing e.g. in comparison to SQL, pandas and polars etc which use "on" as the equivalent argument name.

e.g. in code that looks like this, I find it not very obvious that the intention is to join on col_z, because predicate is an usual name for that arg

table1.join(table2, predicate = 'col_z')

Could "on" be added as an alternative argname for predicate? Current alternative is to present 'col_z' as an arg rather than a kwarg, but I like the clarity given by using a kwarg

Thanks & sorry for the collection of issues - just getting into the library and bringing my annoying opinions :)

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

JonAnCla avatar Jul 05 '24 12:07 JonAnCla

Changing to on would be an improvement IMO, but pretty annoying for end users at this point since its an API that's been around for quite some time and so I'm not sure it's worth the churn.

Definite +1 to adding a few more details to the predicate docstring to make it easier to find what you need!

cpcloud avatar Jul 05 '24 16:07 cpcloud

For clarity, predicate is referring to the general, desugared form. The pandas-style syntax is syntax sugar around an equality predicate.

cpcloud avatar Jul 05 '24 16:07 cpcloud

Changing to on would be an improvement IMO, but pretty annoying for end users at this point since its an API that's been around for quite some time and so I'm not sure it's worth the churn.

Wonder if we could expose both in the interim, and drop the predicates alias in the future? Or even just keep it (just as you can aggregate or agg, but change all the docs to use on instead)?

deepyaman avatar Jul 05 '24 17:07 deepyaman

Thanks both, my vote would be to:

  1. add "on" as a kwarg & keep "predicates" as an alias of "on"
  2. switch docs to show "on" instead of "predicates" that
  3. [optional] eventually deprecate end then remove "predicates"

I think 1 could be done first, or 1&2, and then come back to 3?

JonAnCla avatar Jul 08 '24 11:07 JonAnCla