Timbre Dissimilarity Metrics
A collection of metrics for evaluating timbre dissimilarity using the TorchMetrics API. Work in progress and subject to sudden change — use in projects at your own risk.
Installation
pip install -e .
Usage
import timbremetrics

datasets = timbremetrics.list_datasets()
dataset = datasets[0]  # get the first timbre dataset

# MAE between target dataset and pred embedding distances
metric = timbremetrics.TimbreMAE(
    margin=0.0, dataset=dataset, distance=timbremetrics.l1
)

# get numpy audio for the timbre dataset
audio = timbremetrics.get_audio(dataset)

# get arbitrary embeddings for the timbre dataset's audio
# (`net` stands for your own embedding model)
embeddings = net(audio)

# compute the metric
metric(embeddings)
Metrics
The following metrics are implemented.
Mean Squared Error
Gives the mean squared error between the upper triangles of the predicted distance matrix and the target distance matrix.
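Written out, this could take the following form. The symbols are assumptions introduced here for illustration: D is the target distance matrix, \hat{D} the predicted distance matrix, and N the number of items. The sum is assumed to run over the strict upper triangle (diagonal excluded), normalised by the number of item pairs.

```latex
\mathrm{MSE}(\hat{D}, D) = \frac{2}{N(N-1)} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \left( \hat{D}_{ij} - D_{ij} \right)^{2}
```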
Mean Absolute Error
Gives the mean absolute error between the upper triangles of the predicted distance matrix and the target distance matrix.
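As a rough illustration of comparing upper triangles, the following NumPy sketch computes a mean absolute error over the strict upper triangle of two distance matrices. It is not the package's implementation: the function names `upper_triangle` and `upper_triangle_mae` are hypothetical, and excluding the diagonal is an assumption.

```python
import numpy as np


def upper_triangle(dist):
    """Return the strict upper triangle of a square matrix as a 1-D array."""
    dist = np.asarray(dist, dtype=float)
    rows, cols = np.triu_indices(dist.shape[0], k=1)  # k=1 skips the diagonal
    return dist[rows, cols]


def upper_triangle_mae(pred_dist, target_dist):
    """Mean absolute error between the upper triangles (hypothetical helper)."""
    diff = upper_triangle(pred_dist) - upper_triangle(target_dist)
    return float(np.mean(np.abs(diff)))


# Toy symmetric distance matrices for three items.
target = np.array([[0.0, 1.0, 2.0],
                   [1.0, 0.0, 3.0],
                   [2.0, 3.0, 0.0]])
pred = np.array([[0.0, 1.5, 2.0],
                 [1.5, 0.0, 2.0],
                 [2.0, 2.0, 0.0]])

print(upper_triangle_mae(pred, target))  # 0.5
```

Because distance matrices are symmetric, restricting the comparison to the upper triangle counts each pair of items once.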
Item Rank Agreement
Gives the proportion of per-item distance ranks that match between the predicted distance matrix and the target distance matrix. Agreement is counted with an indicator function that equals 1 when two ranks are equal and 0 otherwise. Both distance matrices are ranked per item, so that each row contains the ordinal distances from the corresponding item. We also provide a top-k version, which computes this metric considering only the closest k items in each row.
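The per-item ranking can be sketched in NumPy as follows. This is an illustrative reading of the description, not the package's code: the names `rank_rows` and `item_rank_agreement` are hypothetical, each item's distance to itself is ignored, and the top-k variant is assumed to select the k closest items according to the target ranking. It also assumes a zero diagonal and strictly positive off-diagonal distances.

```python
import numpy as np


def rank_rows(dist):
    """Ordinal rank of every entry within its row (0 = closest)."""
    order = np.argsort(dist, axis=1, kind="stable")
    # The argsort of a permutation is its inverse, which maps positions to ranks.
    return np.argsort(order, axis=1, kind="stable")


def item_rank_agreement(pred_dist, target_dist, k=None):
    """Proportion of per-row ranks that match (hypothetical helper)."""
    pred_ranks = rank_rows(np.asarray(pred_dist, dtype=float))
    target_ranks = rank_rows(np.asarray(target_dist, dtype=float))
    n = target_ranks.shape[0]
    keep = ~np.eye(n, dtype=bool)  # ignore each item's distance to itself
    if k is not None:
        # Rank 0 is the item itself, so target ranks 1..k are its k closest items.
        keep &= target_ranks <= k
    return float(np.mean(pred_ranks[keep] == target_ranks[keep]))


target = np.array([[0.0, 1.0, 2.0],
                   [1.0, 0.0, 3.0],
                   [2.0, 3.0, 0.0]])
pred = np.array([[0.0, 2.0, 1.0],
                 [2.0, 0.0, 3.0],
                 [1.0, 3.0, 0.0]])

# 4 of the 6 off-diagonal ranks match; the first row swaps its two neighbours.
print(item_rank_agreement(pred, target))
print(item_rank_agreement(pred, target, k=1))
```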
Triplet Agreement
Samples pseudo-triplets from the target distance matrix according to a positivity radius and margin, and returns the proportion of these triplets for which ordering is retained in the predicted distance matrix, with the margin optionally enforced.
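The sampling rule is not spelled out here, so the sketch below shows only one plausible reading and should not be taken as the package's behaviour. It assumes that a positive is any item within `radius` of the anchor in the target matrix, and that a negative is any item at least `margin` farther from the anchor than that positive. The function name `triplet_agreement` and the parameter names `radius`, `margin`, and `enforce_margin` are hypothetical.

```python
import numpy as np


def triplet_agreement(pred_dist, target_dist, radius, margin=0.0, enforce_margin=False):
    """Proportion of target pseudo-triplets whose ordering survives in pred."""
    pred = np.asarray(pred_dist, dtype=float)
    target = np.asarray(target_dist, dtype=float)
    n = target.shape[0]
    gap = margin if enforce_margin else 0.0
    kept = total = 0
    for a in range(n):  # anchor
        for p in range(n):  # candidate positive
            if p == a or target[a, p] > radius:
                continue  # outside the positivity radius (assumed rule)
            for q in range(n):  # candidate negative
                if q in (a, p) or target[a, q] <= target[a, p] + margin:
                    continue  # not farther than the positive by more than the margin
                total += 1
                if pred[a, p] + gap < pred[a, q]:
                    kept += 1
    return kept / total if total else float("nan")


target = np.array([[0.0, 1.0, 2.0],
                   [1.0, 0.0, 3.0],
                   [2.0, 3.0, 0.0]])
pred = np.array([[0.0, 2.0, 1.0],
                 [2.0, 0.0, 3.0],
                 [1.0, 3.0, 0.0]])

# Two triplets are sampled, (0, 1, 2) and (1, 0, 2); only the second is retained.
print(triplet_agreement(pred, target, radius=1.0))  # 0.5
```

Setting `enforce_margin=True` in this sketch makes the check stricter: the predicted negative must be farther than the predicted positive by more than `margin`.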
Triplet K-NN Agreement
For each anchor a from the target distance matrix D, all triplets (a, i, j) are sampled where i and j are in a's K-nearest neighborhood and D(a, i) < D(a, j). The metric returns the proportion of these triplets for which ordering is retained in the predicted distance matrix.
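A minimal sketch of this procedure, written with plain loops for readability, might look as follows. The function name `triplet_knn_agreement` is hypothetical, and the sketch assumes that the anchor itself is excluded from its own neighborhood and that "ordering is retained" means a strict inequality in the predicted distances.

```python
from itertools import permutations

import numpy as np


def triplet_knn_agreement(pred_dist, target_dist, k):
    """Proportion of K-NN triplets (a, i, j) whose order is kept in pred."""
    pred = np.asarray(pred_dist, dtype=float)
    target = np.asarray(target_dist, dtype=float)
    n = target.shape[0]
    kept = total = 0
    for a in range(n):
        others = [i for i in range(n) if i != a]
        # The k items closest to the anchor according to the target distances.
        neighbours = sorted(others, key=lambda i: target[a, i])[:k]
        for i, j in permutations(neighbours, 2):
            if target[a, i] < target[a, j]:  # i is closer to a than j in the target
                total += 1
                if pred[a, i] < pred[a, j]:  # same ordering in the prediction
                    kept += 1
    return kept / total if total else float("nan")


target = np.array([[0.0, 1.0, 2.0],
                   [1.0, 0.0, 3.0],
                   [2.0, 3.0, 0.0]])
pred = np.array([[0.0, 2.0, 1.0],
                 [2.0, 0.0, 3.0],
                 [1.0, 3.0, 0.0]])

# With k=2 each anchor yields one triplet; two of the three keep their order.
print(triplet_knn_agreement(pred, target, k=2))
```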
Mantel Test
The Mantel test computes Pearson's r or Spearman's rho on the condensed form of the upper triangles of the predicted and target distance matrices. The significance of the given result can be estimated using permutation analysis.
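To make the permutation analysis concrete, here is a small NumPy sketch of a one-sided Mantel test using Pearson's r. It is an illustration under stated assumptions rather than the package's implementation: the names `condensed` and `mantel`, the default of 999 permutations, and the one-sided p-value are all choices made for this example.

```python
import numpy as np


def condensed(dist):
    """Strict upper triangle of a square matrix as a 1-D vector."""
    rows, cols = np.triu_indices(dist.shape[0], k=1)
    return dist[rows, cols]


def mantel(pred_dist, target_dist, n_permutations=999, seed=0):
    """Return (Pearson's r, one-sided permutation p-value)."""
    pred = np.asarray(pred_dist, dtype=float)
    target = np.asarray(target_dist, dtype=float)
    target_vec = condensed(target)
    observed = np.corrcoef(condensed(pred), target_vec)[0, 1]

    rng = np.random.default_rng(seed)
    n = pred.shape[0]
    hits = 0
    for _ in range(n_permutations):
        order = rng.permutation(n)
        # Relabel items: permute rows and columns of one matrix together.
        shuffled = pred[np.ix_(order, order)]
        if np.corrcoef(condensed(shuffled), target_vec)[0, 1] >= observed:
            hits += 1
    # Count the observed statistic itself so the p-value is never zero.
    p_value = (hits + 1) / (n_permutations + 1)
    return float(observed), p_value


# Toy data: items on a line, with a monotone distortion as the "prediction".
x = np.array([0.0, 1.0, 3.0, 6.0, 10.0, 15.0, 21.0, 28.0])
target = np.abs(x[:, None] - x[None, :])
pred = np.sqrt(target)

r, p = mantel(pred, target)
print(r, p)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent; shuffling the condensed vector directly would break the matrix structure. For Spearman's rho, the two condensed vectors would be rank-transformed before computing the correlation.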
References
@article{thoret2021learning,
  title={Learning metrics on spectrotemporal modulations reveals the perception of musical instrument timbre},
  author={Thoret, Etienne and Caramiaux, Baptiste and Depalle, Philippe and McAdams, Stephen},
  journal={Nature Human Behaviour},
  volume={5},
  number={3},
  pages={369--377},
  year={2021},
  publisher={Nature Publishing Group}
}
- original data source: https://github.com/EtienneTho/musical-timbre-studies