amazon-denseclus
UMAP hyperparameters question
UMAP has three groups of hyperparameters here. When should we use the "categorical" and "numerical" dicts, and when the "combined" dict?
default_umap_params = {
    "categorical": {
        # jaccard is an option but only takes sparse input
        "metric": "hamming",
        "n_neighbors": 30,
        "n_components": 5,
        "min_dist": 0.0,
    },
    "numerical": {
        "metric": "l2",
        "n_neighbors": 30,
        "n_components": 5,
        "min_dist": 0.0,
    },
    "combined": {
        "n_neighbors": 30,
        "min_dist": 0.0,
        "n_components": 5,
    },
}
@timofeytkachenko Ideally, your use case won't require going too deep into the weeds and you can use the presets. "combined" is used when it creates an intersection_union_mapper mapper, like here, and thus needs to fit a third UMAP over the other two. Regardless, it always fits two UMAPs, using "numerical" and "categorical". We've got another notebook coming that shows these and might shed more light, but please use the first one until then. FYI @bharven @srushtii-aws
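If you do end up needing to change a value, one way to stay close to the presets is to override only the keys you care about and leave the rest alone. The sketch below is plain Python and does not call DenseClus itself; `merge_umap_params` is a hypothetical helper written for illustration, and how the resulting dict is handed to the library is an assumption you should check against the version you have installed.

```python
from copy import deepcopy

# Defaults copied from the question above.
DEFAULT_UMAP_PARAMS = {
    "categorical": {
        "metric": "hamming",
        "n_neighbors": 30,
        "n_components": 5,
        "min_dist": 0.0,
    },
    "numerical": {
        "metric": "l2",
        "n_neighbors": 30,
        "n_components": 5,
        "min_dist": 0.0,
    },
    "combined": {
        "n_neighbors": 30,
        "min_dist": 0.0,
        "n_components": 5,
    },
}


def merge_umap_params(overrides, defaults=DEFAULT_UMAP_PARAMS):
    """Hypothetical helper: return a copy of the defaults with overrides applied.

    Only the three known groups are accepted, so a typo in a group
    name fails loudly instead of being silently ignored.
    """
    merged = deepcopy(defaults)  # never mutate the shared defaults
    for group, params in overrides.items():
        if group not in merged:
            raise KeyError(f"unknown UMAP parameter group: {group!r}")
        merged[group].update(params)
    return merged


# Change only the numerical neighbourhood size; everything else keeps its preset.
custom_params = merge_umap_params({"numerical": {"n_neighbors": 15}})
print(custom_params["numerical"])
```

The "categorical" and "numerical" groups matter for every run, since those two UMAPs are always fitted; the "combined" group only matters when the combine method fits a third UMAP, as described in the reply above.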