
Question about the value of "n_remove" in riskloc

Open ZhihuangLi1221 opened this issue 1 year ago • 3 comments

Hi,

I hope you are well.

When I used riskloc on my dataset, I noticed that it can precisely find the root cause. However, my goal is to find the anomalies that occur more frequently, so I would treat the rare root causes it found as outliers. I then tried increasing the value of "n_remove", but still did not get the result I expected.

Also, when I decreased "n_remove" to 1, the "cutoff" value shifted a lot and the output returned null. When I did the same thing on another dataset, the result was not affected. I compared the distributions of the measurements in the two datasets: the first is closer to a normal distribution, while the second is closer to a long-tailed distribution.

Here are my questions:

  1. Is adjusting n_remove a way to do what I expect? If yes, is there some more reliable way than setting constants arbitrarily?
  2. Does the distribution of the measurements affect the performance of the algorithm?

I am looking forward to your reply.

ZhihuangLi1221 • Nov 10 '22 02:11