pandas

ENH: Implemented MultiIndex.searchsorted method (GH14833)

Open GSAUC3 opened this issue 7 months ago • 12 comments

  • [X] closes #14833
  • [X] Tests added and passed if fixing a bug or adding a new feature
  • [X] All code checks passed.
  • [X] Added type annotations to new arguments/methods/functions.
  • [x] Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

GSAUC3 avatar May 12 '25 17:05 GSAUC3

Thanks @GSAUC3 for the PR. Is there an issue, or has there been any discussion about this elsewhere?

datapythonista avatar May 12 '25 17:05 datapythonista

Hi @datapythonista, this is the issue against which I made the pull request: https://github.com/pandas-dev/pandas/issues/14833

GSAUC3 avatar May 12 '25 17:05 GSAUC3

Hi, I ran pytest and pre-commit locally. Is it possible to run all of these tests locally?

GSAUC3 avatar May 12 '25 19:05 GSAUC3

@GSAUC3 Add a return statement to your function after the except block.

  • Check the docstring and the type hints. See the IndexOpsMixin class, where searchsorted has a properly defined parameter structure.
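
As a generic illustration of the first point (not the code from this PR), the sketch below shows a function whose except branch sets a fallback and whose single return statement sits after the try/except, so no path falls off the end and implicitly returns None. The name find_position and the -1 fallback are hypothetical.

```python
def find_position(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent."""
    try:
        position = sorted_values.index(target)
    except ValueError:
        # Without a fallback and a return after this block, the function
        # would silently return None on this path.
        position = -1
    # Single return after the try/except covers both paths.
    return position
```

With this shape, find_position([1, 3, 5], 3) gives 1 and find_position([1, 3, 5], 4) gives -1.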

RB-zentronlabs avatar May 13 '25 07:05 RB-zentronlabs

Hi @datapythonista, I am having trouble running all of these tests locally before committing my code. So far I have only run pytest locally, and that worked. Could you please guide me on how to set up the testing environment locally so that I can run the checks before each commit?

Will running pre-commit run --all-files suffice?

GSAUC3 avatar May 15 '25 00:05 GSAUC3

pre-commit should run automatically on each commit once it is set up. All the information on how to set up the development environment and run the tests is in the development documentation: https://pandas.pydata.org/docs/development/index.html

datapythonista avatar May 15 '25 08:05 datapythonista

Hi @datapythonista, this part of the error message tells us that the searchsorted method should fail, but it is passing. Am I correct?

=================================== FAILURES ===================================
__________________________ test_searchsorted[tuples] ___________________________
[gw0] darwin -- Python 3.10.17 /Users/runner/micromamba/envs/test/bin/python3.10
[XPASS(strict)] np.searchsorted doesn't work on pd.MultiIndex: GH 14833
___________________ test_searchsorted[mi-with-dt64tz-level] ____________________
[gw0] darwin -- Python 3.10.17 /Users/runner/micromamba/envs/test/bin/python3.10
[XPASS(strict)] np.searchsorted doesn't work on pd.MultiIndex: GH 14833
___________________________ test_searchsorted[multi] ___________________________

RB-zentronlabs avatar May 15 '25 08:05 RB-zentronlabs

Hi @datapythonista, this part of the error message tells us that the searchsorted method should fail, but it is passing. Am I correct?

Yes, that's correct. I guess we have an xfail for the test that should be removed.

datapythonista avatar May 15 '25 08:05 datapythonista

Hi @datapythonista. Thank you for your suggestions. I've addressed the earlier feedback, and the CI checks are now passing. This PR should be ready for review whenever you get a chance. Please let me know if any changes are required. Thanks again!

GSAUC3 avatar May 17 '25 17:05 GSAUC3

Hi @datapythonista, I hope you're doing well. Apologies if this is a basic question, as I'm still relatively new to open source. I noticed that the pull request now shows an “outdated” tag on some of the files I contributed to, and I'm not entirely sure what that means. Should I be concerned about it? Should I update the branch? Thanks in advance for your guidance!

GSAUC3 avatar May 21 '25 16:05 GSAUC3

Hi @mroeschke, thank you for the review and helpful feedback.

I understand that ExtensionArray currently only supports 1D data, and making it work with 2D inputs would likely take some deeper changes.

If the long-term goal is to update algorithms.searchsorted to support 2D inputs and dispatch to the array — so that ExtensionArray can benefit automatically — I’d be happy to help with that.

Please let me know how you’d like to move forward. I’d be glad to contribute to any changes or help explore what’s needed.

GSAUC3 avatar May 28 '25 02:05 GSAUC3

Hi @mroeschke, I've made the required changes. I have a question: would it be appropriate to implement this using binary search? I already have a working implementation ready, and I'm happy to push it if that's the recommended approach. Let me know what you think!
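
To make the binary-search idea concrete, here is a rough pure-Python sketch, assuming the keys are already sorted lexicographically and represented as a plain list of tuples. It is not the implementation from this PR and does not use pandas internals; searchsorted_tuples is a hypothetical helper name, and side mirrors the left/right convention of np.searchsorted.

```python
from bisect import bisect_left, bisect_right


def searchsorted_tuples(sorted_keys, value, side="left"):
    """Return the insertion index of value in a lexicographically sorted list of tuples."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    # Python compares tuples element by element, which matches the
    # lexicographic ordering of a sorted multi-level key.
    find = bisect_left if side == "left" else bisect_right
    return find(sorted_keys, value)


keys = [("a", 1), ("a", 3), ("b", 2), ("c", 0)]
print(searchsorted_tuples(keys, ("b", 1)))                # 2
print(searchsorted_tuples(keys, ("a", 3), side="right"))  # 2
```

Each lookup takes O(log n) tuple comparisons; a real implementation would still need to handle unsorted inputs, missing values, and mixed level dtypes.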

GSAUC3 avatar Jun 03 '25 15:06 GSAUC3

Hi @datapythonista and @mroeschke 👋,

I hope you're both doing well! Just a gentle reminder regarding PR #61435. I've addressed the requested changes and would appreciate it if you could take a look when you have a moment. Please let me know if any further modifications are needed.

Thank you very much for your time and guidance!

GSAUC3 avatar Jun 19 '25 17:06 GSAUC3

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

github-actions[bot] avatar Jul 20 '25 00:07 github-actions[bot]

@datapythonista Hi, I hope you are doing well. Would you mind reviewing this pull request, please?

GSAUC3 avatar Jul 20 '25 02:07 GSAUC3