planning
Tidyclust
This is my writeup with some details about clustering (unsupervised learning) and how I envision that fitting into the tidymodels framework.
Just wanted to comment that for unsupervised methods there is often both a forward and a backward transformation. This is less true of clustering, but it holds for many PCA-like tools. I previously brought this up a little in tidymodels/recipes#264.
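To illustrate the forward/backward idea for a PCA-like tool, here is a minimal sketch in plain Python rather than tidymodels code. The function names `fit_pc1`, `forward`, and `backward` are hypothetical, and it assumes a single component estimated by power iteration, so it is an illustration, not how recipes or any PCA implementation actually works.

```python
import math

def fit_pc1(rows, n_iter=200):
    """Estimate column means and the first principal direction by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    w = [1.0] * d  # arbitrary starting direction (assumed not orthogonal to PC1)
    for _ in range(n_iter):
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return means, w

def forward(row, means, w):
    """Forward transformation: original space -> component score."""
    return sum((x - m) * v for x, m, v in zip(row, means, w))

def backward(score, means, w):
    """Backward transformation: score -> original space (lossy with one component)."""
    return [m + score * v for m, v in zip(means, w)]

# Toy data lying close to a straight line, so one component captures most of it.
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
means, w = fit_pc1(data)
scores = [forward(r, means, w) for r in data]
reconstructed = [backward(s, means, w) for s in scores]
```

The backward step only recovers the part of each row that lies along the retained component, which is why an interface for such methods would need to expose both directions explicitly.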
Another possible consideration before starting to prototype things is the difference between inductive/transductive models, or methods that can be applied to a new dataset versus those that cannot.
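As a rough illustration of that distinction, here is a minimal sketch in plain Python (not tidymodels code; `fit_kmeans` and `predict_cluster` are hypothetical names). It assumes k-means with caller-supplied starting centroids: because the fitted centroids are kept, unseen points can be labelled afterwards, which is what makes the method usable inductively.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def fit_kmeans(points, init_centroids, n_iter=20):
    """Plain Lloyd iterations starting from caller-supplied centroids."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(n_iter):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members; keep it if empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def predict_cluster(new_points, centroids):
    """Assign unseen points to the nearest fitted centroid."""
    return [nearest(p, centroids) for p in new_points]

train = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids = fit_kmeans(train, init_centroids=[train[0], train[3]])
new_labels = predict_cluster([(0.1, 0.1), (5.1, 5.0)], centroids)
```

A method that only returns labels for the rows it was fitted on has no counterpart to `predict_cluster`, so applying it to a new dataset means refitting or bolting on a separate assignment rule.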
Heyo! Really excited to see this developing. I've recently fallen into the world of unsupervised clustering (via some gnar text projects) and have had a hard time understanding the literature and finding certain methods in tidymodels.
Very cool! Looking forward to developments here and this functionality coming to tidymodels!
I posted a toy solution on SO for validating kmeans cluster partition stability on a holdout set: https://stackoverflow.com/a/68845111/9059865 . (For anyone stumbling onto this thread and looking for something simple in the interim before {celery} 😊 gets implemented in tidymodels.)
Closing this as we already have this :)