graspologic
graspologic copied to clipboard
[BUG] Does SSG handle ties appropriately?
Expected Behavior
when computing a subgraph, ties are randomly broken.
Actual Behavior
The code is fairly opaque to me relative the "traditional" algorithms for solving signal subgraph, I'm sure they are great/way faster than how I would have implemented, but I can't totally follow exactly what's going on so I can't pinpoint specific lines. However, when ties are experienced in rankings, they should be randomly broken, not broken according to the metric that numpy uses (usually, lowest according to the datatype being ranked, to my knowledge). As far as I can tell, every sory/argsort/minimum/maximum called in the code uses the bare-bones numpy defaults for breaking ties, which is incorrect.
Suggested (potential) fix: write an implementation that uses scipy stats rankdata instead, use method to break ties as the average, and then use the number of tied entries for a particular duplicated value to obtain the "value range" that the tied entries would encompass. Then, just use numpy random choice to select from that value range without replacement and at random to reassign "randomly broken" ties. Alternatively, for every sort called, instead start by randomly reassigning positions, then do the sort/minimum/maximum/etc. (which will break ties in order of appearance, which is now random), and then find the mapping from initial appearance to randomly resorted appearance and then sort the initial array using the double mapping.
Any other information?
The procedures employed are not statistically principled.