pyjanitor
pyjanitor copied to clipboard
non-equi join rewrite numba
PR Description
Please describe the changes proposed in the pull request:
- reimplementation of the algorithm that powers the non-equi join in numba
- more info in the docs regarding range joins, and where numba can be helpful
- improved performance for numba for range joins and multiple non equi joins when getting first/last match
This PR improves conditional_join.
PR Checklist
Please ensure that you have done the following:
- [x] PR in from a fork off your branch. Do not PR from
<your_username>:dev, but rather from<your_username>:<feature-branch_name>.
- [x] If you're not on the contributors list, add yourself to
AUTHORS.md.
- [x] Add a line to
CHANGELOG.mdunder the latest version header (i.e. the one that is "on deck") describing the contribution.- Do use some discretion here; if there are multiple PRs that are related, keep them in a single line.
Automatic checks
There will be automatic checks run on the PR. These include:
- Building a preview of the docs on Netlify
- Automatically linting the code
- Making sure the code is documented
- Making sure that all tests are passed
- Making sure that code coverage doesn't go down.
Relevant Reviewers
Please tag maintainers to review.
- @ericmjl
🚀 Deployed on https://deploy-preview-1341--pyjanitor.netlify.app
Codecov Report
Attention: Patch coverage is 96.44269% with 9 lines in your changes are missing coverage. Please review.
Project coverage is 95.87%. Comparing base (
62c57c6) to head (d7aafb9). Report is 10 commits behind head on dev.
:exclamation: Current head d7aafb9 differs from pull request most recent head 4fca4c0
Please upload reports for the commit 4fca4c0 to get more accurate results.
Additional details and impacted files
@@ Coverage Diff @@
## dev #1341 +/- ##
==========================================
+ Coverage 94.48% 95.87% +1.39%
==========================================
Files 80 80
Lines 4367 4197 -170
==========================================
- Hits 4126 4024 -102
+ Misses 241 173 -68
@samukweku is this one ready for review? On GitHub it looks like it's still being marked as a draft.
@ericmjl not yet, still got some optimisations that I am hopeful to pull off
@ericmjl not familiar with the jitter function. can i get your help with the failing test function. thanks
@ericmjl not familiar with the jitter function. can i get your help with the failing test function. thanks
Not a problem, @samukweku. I encountered a similar issue when testing locally so I decided to make the test a bit more lenient.
@samukweku, apologies for the long turnaround time here. I think we should merge this in, as (a) the PR has been open for a long time, and (b) I've already committed to the branch. Once the CI checks are done, I'll do a merge!
@samukweku I have merged! Thank you for taking care of pyjanitor, I really appreciate it!