
Use energy distance to represent sample similarity

Open kwinkunks opened this issue 2 years ago • 0 comments

Could be another way to measure the similarity between datasets. From the twinning repo: https://github.com/avkl/twinning

`energy()` computes the energy distance (Székely & Rizzo, 2013) between a given dataset and a set of points of the same dimension. Energy distance is the metric minimized by twinning. The following code computes the energy distance between the synthetic dataset and a randomly drawn sample from it. The smaller the energy distance, the more statistically similar the sample is to the dataset.

Original paper: https://www.sciencedirect.com/science/article/pii/S0378375813000633
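
To make the idea concrete, here is a minimal NumPy sketch of the energy distance between a dataset and a set of points. It is not the twinning package's `energy()` implementation, and it may be scaled differently from it. It assumes the usual sample estimate, 2·mean‖x − y‖ − mean‖x − x′‖ − mean‖y − y′‖, with Euclidean norms and all pairs included. The names `pairwise_mean_distance`, `energy_distance`, `dataset`, `sample` and `shifted` are made up for illustration.

```python
import numpy as np


def pairwise_mean_distance(a, b):
    """Mean Euclidean distance over all pairs (one row of a, one row of b)."""
    diffs = a[:, None, :] - b[None, :, :]          # shape (len(a), len(b), dim)
    return np.sqrt((diffs ** 2).sum(axis=-1)).mean()


def energy_distance(data, points):
    """Sample energy distance between two point sets of the same dimension.

    Assumed form: 2 * E|X - Y| - E|X - X'| - E|Y - Y'|, estimated by
    averaging over all pairs (zero self-distances included).
    """
    data = np.asarray(data, dtype=float)
    points = np.asarray(points, dtype=float)
    cross = pairwise_mean_distance(data, points)
    within_data = pairwise_mean_distance(data, data)
    within_points = pairwise_mean_distance(points, points)
    return 2.0 * cross - within_data - within_points


# Synthetic data for illustration only.
rng = np.random.default_rng(42)
dataset = rng.normal(size=(500, 3))

# A random subsample should be statistically close to the dataset...
sample = dataset[rng.choice(len(dataset), size=100, replace=False)]
# ...while a shifted copy of that subsample should be much further away.
shifted = sample + 2.0

print("random sample :", energy_distance(dataset, sample))
print("shifted sample:", energy_distance(dataset, shifted))
```

The all-pairs version uses O(n·m) memory, so it only suits small datasets; a chunked or tree-based approach would be needed for anything large.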

kwinkunks, Aug 08 '23 10:08