David-Elias Künstle
An easy mistake with the ordinal embedding estimators and the estimate_dimensionality_cv function is to pass a number of dimensions (components) < 1 (e.g., #71). This currently fails with crude shape errors. Instead, we...
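A minimal sketch of such an early check, assuming it runs before any array is allocated. The helper name `check_n_components` is hypothetical and not part of the library:

```python
import numbers


def check_n_components(n_components):
    """Hypothetical helper: reject a number of dimensions < 1 with a clear message."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(n_components, bool) or not isinstance(n_components, numbers.Integral):
        raise TypeError(
            f"n_components must be an integer, got {type(n_components).__name__}."
        )
    if n_components < 1:
        raise ValueError(f"n_components must be >= 1, got {n_components}.")
    return int(n_components)
```

Calling `check_n_components(0)` raises a `ValueError` that names the offending argument, instead of a shape error deep inside the optimization.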
Here we keep track of some comparison-based algorithms that could be interesting to implement.
- [ ] Clustering: Crowd-median algorithm (Heikinheimo & Ukkonen, 2013)
- [ ] Embedding: Multi-view embedding...
Comparison queries can ask different questions:
* triplet: d(A, B) < d(A, C)
* quadruplet: d(A, B) < d(C, D)
* odd-one-out: d(B, C) < d(A, B) & d(A, C)...
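The three query types listed here can be sketched as predicates on points with known coordinates. This is only an illustration with hypothetical function names. It assumes Euclidean distance and reads the odd-one-out condition as "d(B, C) is smaller than both d(A, B) and d(A, C)", so that A is the odd one:

```python
import math


def triplet_holds(a, b, c):
    """Triplet: d(A, B) < d(A, C)."""
    return math.dist(a, b) < math.dist(a, c)


def quadruplet_holds(a, b, c, d):
    """Quadruplet: d(A, B) < d(C, D)."""
    return math.dist(a, b) < math.dist(c, d)


def is_odd_one_out(a, b, c):
    """Odd-one-out: A is the odd one if d(B, C) < d(A, B) and d(B, C) < d(A, C)."""
    d_bc = math.dist(b, c)
    return d_bc < math.dist(a, b) and d_bc < math.dist(a, c)


A, B, C, D = (0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (3.0, 2.0)
answers = (
    triplet_holds(A, B, C),        # d(A, B) = 1, d(A, C) = 3
    quadruplet_holds(A, B, C, D),  # d(A, B) = 1, d(C, D) = 2
    is_odd_one_out(C, A, B),       # d(A, B) = 1 is the smallest pairwise distance
)
```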
Wherever we use a distance metric (usually Euclidean), make it configurable, with Euclidean as the default. Allow a string or function argument and additional parameters. Minimum: Minkowski distance, at best every metric of pdist...
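One possible shape for this, sketched by forwarding the metric to `scipy.spatial.distance.pdist`, which already accepts a metric name or a callable plus keyword parameters. The wrapper name `pairwise_distances` and its signature are assumptions for illustration only:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def pairwise_distances(X, metric="euclidean", **metric_params):
    """Hypothetical helper: square distance matrix with a configurable metric.

    metric is a name known to scipy's pdist or a callable f(u, v) -> float;
    metric_params are forwarded unchanged (e.g. p for "minkowski").
    """
    X = np.asarray(X, dtype=float)
    return squareform(pdist(X, metric=metric, **metric_params))


X = [[0.0, 0.0], [3.0, 4.0]]
D_euclidean = pairwise_distances(X)                           # default metric
D_manhattan = pairwise_distances(X, metric="minkowski", p=1)  # string plus parameter
D_chebyshev = pairwise_distances(X, metric=lambda u, v: np.max(np.abs(u - v)))
```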
Add a utility to align and standardize multiple embeddings, e.g. with Procrustes' method. Add a utility to bootstrap queries with any ordinal embedding (OE) algorithm to create aligned embeddings.
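A rough sketch of the alignment part, assuming `scipy.spatial.procrustes` is an acceptable backend. It centers and scales both inputs, then rotates or reflects the second onto the first. The helper `align_embeddings` is hypothetical; the bootstrap part is not covered here:

```python
import numpy as np
from scipy.spatial import procrustes


def align_embeddings(reference, embeddings):
    """Hypothetical helper: align each embedding to a common reference.

    Returns the standardized reference, the aligned embeddings, and the
    Procrustes disparity of each embedding to the reference.
    """
    aligned, disparities = [], []
    reference_std = None
    for embedding in embeddings:
        reference_std, embedding_aligned, disparity = procrustes(reference, embedding)
        aligned.append(embedding_aligned)
        disparities.append(disparity)
    return reference_std, aligned, disparities


rng = np.random.default_rng(0)
reference = rng.normal(size=(10, 2))
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
# Same configuration, but rotated, scaled, and shifted.
transformed = 3.0 * reference @ rotation + 5.0

reference_std, aligned, disparities = align_embeddings(reference, [transformed])
```

Because the second embedding differs from the reference only by a similarity transform, its disparity is numerically zero after alignment.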
Common metrics in the field are:
- k-NN accuracy
- (root) mean squared error of the embedding, but align with Procrustes first
- Procrustes disparity
- correlation / MSE / ... of...
Both the dataset functions themselves (...matrix, ...triplets, ...distances, ...similarities?) and the result object's arguments (e.g. triplet or data? singular or plural?)
Separate the index sampling from the responses to avoid the argument chains that we currently have (and which would get worse):
- index generation (e.g. uniform, k-NN, radius, custom; from embedding...
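The separation described above could look roughly like the following sketch: one function only produces indices, the other only answers them. Both function names are hypothetical, only uniform sampling is shown, and responses are assumed to be noise-free Euclidean triplet comparisons:

```python
import numpy as np


def sample_triplet_indices(n_objects, n_triplets, rng):
    """Index generation only: uniform random triplets of distinct objects."""
    return np.array(
        [rng.choice(n_objects, size=3, replace=False) for _ in range(n_triplets)]
    )


def triplet_responses(embedding, triplets):
    """Response generation only: True where d(i, j) < d(i, k)."""
    i, j, k = triplets.T
    d_ij = np.linalg.norm(embedding[i] - embedding[j], axis=1)
    d_ik = np.linalg.norm(embedding[i] - embedding[k], axis=1)
    return d_ij < d_ik


rng = np.random.default_rng(42)
embedding = rng.normal(size=(20, 2))
triplets = sample_triplet_indices(len(embedding), 100, rng)
responses = triplet_responses(embedding, triplets)
```

With this split, a different sampler (for example k-NN based) can be swapped in without touching the response function or its arguments.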