tidyclust
Consider partition data reduction algorithm
The Partition algorithm has several nice properties, particularly in high dimensions. It's also completely reproducible without a seed. https://github.com/USCbiostats/partition
Correct me if I'm wrong, but isn't this method closer to a dimensionality reduction method?
Now that you mention it, the clustering is really happening at the column level, not the row level like this package is focused on. Maybe better suited to a recipe somewhere?
If a partition learned on the training data can be reapplied to new data, then it would be perfect as a recipe step!