Consider partition data reduction algorithm

Open • malcolmbarrett opened this issue 3 years ago • 3 comments

The Partition algorithm has several nice properties, particularly in high dimensions. It's also completely reproducible without setting a seed. https://github.com/USCbiostats/partition

malcolmbarrett, Jul 27 '22 15:07

Correct me if I'm wrong, but isn't this method closer to a dimensionality reduction method?

EmilHvitfeldt, Mar 30 '23 20:03

Now that you mention it, the clustering is really happening at the column level, not the row level like this package is focused on. Maybe better suited to a recipe somewhere?

malcolmbarrett, Mar 31 '23 15:03

If a trained application can be reapplied to new data, then it would be perfect as a recipe step!

EmilHvitfeldt, Mar 31 '23 17:03