arrow icon indicating copy to clipboard operation
arrow copied to clipboard

GH-46897: [Docs][C++][Python] Fix asof join documentation

Open hadrian-reppas opened this issue 6 months ago • 2 comments

Rationale for this change

The asof join documentation is currently incorrect. Here is a copy of https://github.com/apache/arrow/issues/46897 for convenience:

There are two issues with the asof join docs:

  1. In the doc for the on parameter, it says "a row is considered a match if and only if left_on - tolerance <= right_on <= left_on." This is incorrect because a join with positive tolerance results in right_on values that are greater than or equal to left_on. Also, the inequality does not make sense for negative tolerances.
  2. In the doc for the tolerance parameter, it says "A right row is considered a match with the left row right.on - left.on <= tolerance." This does not mention that the difference must also be greater than or equal to 0. Also, the inequality is only correct for non-negative tolerances.

What changes are included in this PR?

This PR updates the asof join documentation for pyarrow.Table, pyarrow.Dataset and acero::AsofJoinNodeOptions.

Are these changes tested?

N/A

Are there any user-facing changes?

It updates the documentation.

  • GitHub Issue: #46897

hadrian-reppas avatar Jun 24 '25 20:06 hadrian-reppas

:warning: GitHub issue #46897 has been automatically assigned in GitHub to PR creator.

github-actions[bot] avatar Jun 24 '25 20:06 github-actions[bot]

@zanmato1984 do you have some time to review this?

raulcd avatar Jun 25 '25 08:06 raulcd

I'll take a look soon.

zanmato1984 avatar Jul 02 '25 02:07 zanmato1984

Yeah I like that a bit better. I also updated the Python docs with the new wording.

hadrian-reppas avatar Jul 07 '25 22:07 hadrian-reppas

CI failures are unrelated (seems like #47015). I'll merge later.

Thanks @hadrian-reppas a lot for fixing this!

zanmato1984 avatar Jul 08 '25 02:07 zanmato1984

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit e411851738c2528a5ce24857805e0eac57e2c659.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 11 possible false positives for unstable benchmarks that are known to sometimes produce them.