BERTopic icon indicating copy to clipboard operation
BERTopic copied to clipboard

Modify reduce_dimensionality to use fit_transform

Open jemather opened this issue 8 months ago • 6 comments

Modifying _reduce_dimensionality to use fit_transform in single step, supporting the use of nearest neighbor descent when using cuML UMAP.

Fixes #2335

Before submitting

  • [ ] This PR fixes a typo or improves the docs (if yes, ignore all other checks!).
  • [X] Did you read the contributor guideline?
  • [X] Was this discussed/approved via a Github issue? Please add a link to it if that's the case.
  • Adding support for cuML's nn-descent with UMAP
  • [ ] Did you make sure to update the documentation with your changes (if applicable)?
  • [ ] Did you write any new necessary tests?

jemather avatar Apr 28 '25 21:04 jemather

Thanks for the PR! I'm seeing some errors popping up here and there. Could you take a look?

MaartenGr avatar May 06 '25 12:05 MaartenGr

@jemather do you want to look at the failures or would it be Ok for me to "take over" this PR to address the test failures?

betatim avatar May 13 '25 15:05 betatim

@betatim Sorry, started working on the changes and did not commit them. Let me see if I can get this done here in the next hour or so and if not happy to hand it over. Thank you for the nudge!

jemather avatar May 13 '25 15:05 jemather

@betatim took me a few tries, but think it should be set now.

jemather avatar May 13 '25 18:05 jemather

Cool beans! Happy that all it took was a nudge :D :D

betatim avatar May 14 '25 08:05 betatim

I don't have direct experience with BERTopic, coming to this as someone who works on cuml. Hence motivated to help move this PR along. @MaartenGr the diff looks reasonable, but I don't know much about the code base. Is there something I can help with regarding this PR or does it "just" need time from one of the maintainers to take a look at it and think about (I say "just" because maintainer time and brain power is one of the rarest resources on planet earth ;) )

betatim avatar May 15 '25 07:05 betatim

Looks like this might be ready to merge? Is there any way we could help with this PR?

csadorf avatar Aug 27 '25 16:08 csadorf

It's almost there. My comments that I made on June 6th (see above) weren't addressed just yet. If those are addressed and cover the mentioned use cases, then I can go ahead and merge.

MaartenGr avatar Aug 27 '25 17:08 MaartenGr

@jemather I picked up this PR to implement the outstanding review comments in #2416. I hope this is fine with you. I kept all the original commits from this PR, so that you get credit for your work.

betatim avatar Aug 29 '25 15:08 betatim

Closing this as it was picked up and merged in #2416

MaartenGr avatar Sep 25 '25 12:09 MaartenGr