ENH: Implemented MultiIndex.searchsorted method (GH14833)
- [X] closes #14833
- [X] Tests added and passed if fixing a bug or adding a new feature
- [X] All code checks passed.
- [X] Added type annotations to new arguments/methods/functions.
- [x] Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
Thanks @GSAUC3 for the PR. Is there an issue, or has there been any discussion about this elsewhere?
Hi @datapythonista, this is the issue I made the pull request against: https://github.com/pandas-dev/pandas/issues/14833
Hi, I ran pytest and pre-commit locally. Is it possible to run all of these CI tests locally?
@GSAUC3 Add a return statement in your function after the except block.
- Check the docstring and use proper type hints.
Check the IndexOpsMixin class, where searchsorted has a properly defined parameter structure.
Hi @datapythonista, I am having trouble running all of these tests locally before committing my code. So far I have only run pytest locally, and that worked. Could you please guide me on how to set up the testing environment locally so that the checks run before each commit?
Will running
pre-commit run --all-files
suffice?
pre-commit should run automatically if it's set up to work as intended. You have all the information on how to set up the development environment, run tests... in the development documentation: https://pandas.pydata.org/docs/development/index.html
Hi @datapythonista, this part of the error message tells us that the searchsorted tests are expected to fail but are now passing. Am I correct?
=================================== FAILURES ===================================
__________________________ test_searchsorted[tuples] ___________________________
[gw0] darwin -- Python 3.10.17 /Users/runner/micromamba/envs/test/bin/python3.10
[XPASS(strict)] np.searchsorted doesn't work on pd.MultiIndex: GH 14833
___________________ test_searchsorted[mi-with-dt64tz-level] ____________________
[gw0] darwin -- Python 3.10.17 /Users/runner/micromamba/envs/test/bin/python3.10
[XPASS(strict)] np.searchsorted doesn't work on pd.MultiIndex: GH 14833
___________________________ test_searchsorted[multi] ___________________________
Yes, that's correct. I guess we have an xfail for the test that should be removed.
Hi @datapythonista. Thank you for your suggestions. I've addressed the earlier feedback and the CI checks are now passing. This PR should be ready for review whenever you get a chance. Please let me know if any changes are required. Thanks again!
Hi @datapythonista, I hope you're doing well. Apologies if this is a basic question. I'm still relatively new to open source, and I noticed that the pull request now shows an “outdated” tag on some of the files I contributed to. I'm not entirely sure what that means. Should I be concerned about it? Should I update the branch? Thanks in advance for your guidance!
Hi @mroeschke, thank you for the review and helpful feedback.
I understand that ExtensionArray currently only supports 1D data, and making it work with 2D inputs would likely take some deeper changes.
If the long-term goal is to update algorithms.searchsorted to support 2D inputs and dispatch to the array — so that ExtensionArray can benefit automatically — I’d be happy to help with that.
Please let me know how you’d like to move forward. I’d be glad to contribute to any changes or help explore what’s needed.
Hi @mroeschke, I've made the required changes. I have a question: would it be appropriate to implement this using binary search? I already have a working implementation ready, and I'm happy to push it if that's the recommended approach. Let me know what you think!
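To make the binary search idea concrete, here is a rough sketch over a plain list of tuples. It is only an illustration, not the implementation in this PR: the function name `searchsorted_tuples` is hypothetical, and it assumes the keys are already lexicographically sorted, mutually comparable, and free of missing values.

```python
from typing import Sequence, Tuple


def searchsorted_tuples(
    keys: Sequence[Tuple], value: Tuple, side: str = "left"
) -> int:
    """Return the insertion position of `value` in lexsorted `keys`.

    Hypothetical helper: mirrors the left/right semantics of searchsorted,
    relying on Python's built-in lexicographic tuple comparison.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")

    lo, hi = 0, len(keys)
    while lo < hi:
        mid = (lo + hi) // 2
        # Move right past smaller keys; for side="right", also past equal keys.
        if keys[mid] < value or (side == "right" and keys[mid] == value):
            lo = mid + 1
        else:
            hi = mid
    return lo


keys = [("a", 1), ("a", 2), ("b", 1), ("b", 1), ("c", 3)]
left = searchsorted_tuples(keys, ("b", 1), side="left")
right = searchsorted_tuples(keys, ("b", 1), side="right")
```

With the sample keys above, `left` is the index of the first `("b", 1)` entry and `right` is the index just past the last one.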
Hi @datapythonista and @mroeschke 👋,
I hope you're both doing well! Just a gentle reminder regarding PR #61435. I've addressed the requested changes and would appreciate it if you could take a look when you have a moment. Please let me know if any further modifications are needed.
Thank you very much for your time and guidance!
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.
@datapythonista Hi, I hope you are doing well. Would you mind reviewing this pull request, please?