EMMV_benchmarks
Understanding the parameters of the `em` and `mv` functions
Hi there
It is difficult for me to map the parameters of the two functions `em` and `mv`
em(t, t_max, volume_support, s_unif, s_X, n_generated)
mv(axis_alpha, volume_support, s_unif, s_X, n_generated)
to the description in the paper.
Could you give some hints?
Thanks
Hi,
sorry for the delay, and for this old, unmaintained, uncommented and unclean code (I did not have much experience with Python at the time of writing).
- t is an array of levels at which we want to evaluate EM_s(t), using samples X drawn from an underlying density f.
- s_unif stands for s evaluated on a uniform sample generated on a rectangle containing all the data X. This is used to estimate Leb(s>t).
- s_X stands for s(X), i.e. s evaluated on a sample from the underlying density f. This is used to estimate P(s>t).
- n_generated is just the size of the uniform sample.
- volume_support is the volume of the rectangle containing all the data X.
- t_max should be called EM_min instead. We are only interested in the beginning of the EM curve.
hope that helps!
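To make the roles of these parameters concrete, here is a minimal sketch of an empirical EM curve built only from the description above: P(s>u) is estimated from s_X, Leb(s>u) from s_unif scaled by volume_support, and EM_s(t) is taken as the largest value of P(s>u) - t * Leb(s>u) over thresholds u. The function name `em_curve_sketch` is hypothetical, and this is not the repository's `em` function: it leaves out the truncation controlled by t_max and any area-under-curve summary.

```python
import numpy as np


def em_curve_sketch(t, volume_support, s_unif, s_X):
    """Hypothetical helper: empirical EM_s(t) for each level in t.

    Assumes higher scores mean "more normal", so level sets are {s > u}.
    """
    t = np.asarray(t, dtype=float)
    s_unif = np.asarray(s_unif, dtype=float)
    s_X = np.asarray(s_X, dtype=float)

    # Start from 0, the value obtained with an empty level set.
    em_t = np.zeros_like(t)

    # Candidate thresholds u: the distinct scores observed on the data.
    for u in np.unique(s_X):
        # Fraction of data points scored above u, an estimate of P(s > u).
        mass = np.mean(s_X > u)
        # Fraction of uniform points scored above u, rescaled by the
        # rectangle volume, an estimate of Leb(s > u).
        volume = np.mean(s_unif > u) * volume_support
        em_t = np.maximum(em_t, mass - t * volume)
    return em_t


# Tiny hand-checkable example with made-up scores.
em_values = em_curve_sketch(
    t=[0.0, 1.0],
    volume_support=2.0,
    s_unif=[0.0, 0.0, 0.0, 5.0],
    s_X=[1.0, 2.0, 3.0, 4.0],
)
print(em_values)
```

Note that n_generated does not appear as a separate argument here because it is just the length of s_unif.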
Hi Nicolas
Many thanks.
Would you mind answering one more question?
Suppose I have a dataframe (`df`) that contains the data points, and a list (`abnormal_score`) that contains anomaly scores calculated by a model (a one-class SVM, for instance). Given that, how should I call the function `em` or `mv` to calculate the score?
Thanks
You need more information from the one-class SVM's decision_function. More precisely, you have a dataframe representing `decision_function(X)`, and you also need the same dataframe but for uniform samples on the rectangle containing your data X, i.e. `decision_function(uniform_sample)`.
With these two dataframes you should be able to compute em and mv as done in em_bench.py for numpy arrays.
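As a rough illustration of that preparation step, the sketch below builds the rectangle around the data, draws a uniform sample in it, and scores both the data and the uniform sample. The helper name `prepare_em_mv_inputs` and the toy scoring function are hypothetical stand-ins, not part of the repository; `score_fn` is assumed to be any callable that maps an array of points to one score per point, with higher scores meaning more normal.

```python
import numpy as np


def prepare_em_mv_inputs(X, score_fn, n_generated=10000, seed=0):
    """Hypothetical helper: build volume_support, s_unif and s_X from data X."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Axis-aligned rectangle containing all the data, and its volume.
    lower, upper = X.min(axis=0), X.max(axis=0)
    volume_support = float(np.prod(upper - lower))

    # Uniform sample on that rectangle, one row per generated point.
    uniform_sample = rng.uniform(lower, upper, size=(n_generated, X.shape[1]))

    # Scores on the real data and on the uniform sample.
    s_X = np.asarray(score_fn(X)).ravel()
    s_unif = np.asarray(score_fn(uniform_sample)).ravel()
    return volume_support, s_unif, s_X, n_generated


# Toy data and a toy scorer (negative distance to the data mean),
# standing in for a fitted model's decision function.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 2))
center = X.mean(axis=0)


def toy_score(Z):
    return -np.linalg.norm(Z - center, axis=1)


volume_support, s_unif, s_X, n_generated = prepare_em_mv_inputs(
    X, toy_score, n_generated=1000
)
print(volume_support, s_unif.shape, s_X.shape, n_generated)
```

With a fitted scikit-learn `OneClassSVM`, its `decision_function` method could be passed as `score_fn`, and a dataframe could be converted with `df.to_numpy()` first. The resulting values line up with the `volume_support`, `s_unif`, `s_X` and `n_generated` arguments in the signatures quoted at the top of the thread. A list of scores on the data alone is not enough, since the model must also score the uniform sample.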