
Median calculation in Threshold

Open ElliotMebane opened this issue 5 years ago • 3 comments

I noticed that the median seemed to favor the last value of the range instead of the middle of the whole range of values used in the interval. I checked the calculation, and the median calculation only uses the last iterator value:

```c
for (i = 0, sum = 0; (i <= 1024) && (sum < numtrue); i++) {
    sum += buckets[i];
}
to_input = size_value(0.0f, 1024.0f, (float)i, in_ports[0].in_min, in_ports[0].in_max, 0);
```

It looks like the buckets need to be sorted then the middle value in the list should be chosen.
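For reference, the conventional sort-based median that suggestion describes might look like the sketch below. All names here are hypothetical; this is not BrainBay's actual code, just an illustration of "sort, then take the middle value":

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for floats */
static int cmp_float(const void *a, const void *b)
{
    float fa = *(const float *)a, fb = *(const float *)b;
    return (fa > fb) - (fa < fb);
}

/* Conventional median: copy the samples into scratch, sort the copy,
   and take the middle element (for an even count this picks the upper
   of the two middle values). Costs O(n log n) per interval, which is
   what the bucket approach tries to avoid. */
static float median_sorted(const float *values, size_t n, float *scratch)
{
    memcpy(scratch, values, n * sizeof(float));
    qsort(scratch, n, sizeof(float), cmp_float);
    return scratch[n / 2];
}
```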

ElliotMebane avatar May 17 '19 05:05 ElliotMebane

Oops!

That bug has been in there for a long time! Actually, the median value was added by a contributor and I didn't check its functionality well enough. I'll have a look when time permits.

ChrisVeigl avatar May 17 '19 07:05 ChrisVeigl

> It looks like the buckets need to be sorted then the middle value in the list should be chosen.

The code for median was contributed before BrainBay was under version control. After having a look, I'm not so sure that the calculation is wrong:

The buckets are used in order to avoid sorting the incoming values (to save computation effort, particularly for larger intervals). They divide the whole signal range into 1024 "bins" of equal size (sacrificing precision). For each incoming value, its associated bin is incremented by one. So the for loop above IMO makes sense for finding the bottom x% or top x% of the values (represented by the bottom/top numtrue bin entries, where numtrue is the number of samples in that interval * x/100).
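As I read that explanation, the bottom-x% lookup is a cumulative scan over the histogram. The sketch below uses hypothetical names and omits the rescaling that the real code does with size_value(); it reproduces the loop quoted in the issue:

```c
#define NUM_BINS 1024

/* Walk the bins from the bottom, summing counts, until at least
   `numtrue` samples are covered. The returned bin index approximates
   the value below which numtrue samples fall (the bottom-x% threshold).
   Note: because i is incremented after the sum test, the returned
   index is one past the bin that crossed the threshold, mirroring the
   original loop. The array has NUM_BINS + 1 entries (indices 0..1024). */
static int bottom_percentile_bin(const int buckets[NUM_BINS + 1], int numtrue)
{
    int i, sum;
    for (i = 0, sum = 0; (i <= NUM_BINS) && (sum < numtrue); i++)
        sum += buckets[i];
    return i;
}
```

With numtrue set to half the number of samples in the interval, the same scan yields an approximate median without ever sorting, at the cost of 1024-bin quantization.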

ChrisVeigl avatar May 19 '19 12:05 ChrisVeigl

OK, I thought the term median was being used in its traditional sense (the middle bucket).

The fan on my VR-ready laptop engages when BrainBay runs, so I suppose all the optimization that can be done is worth it.

- New values outside the min/max settings get clipped to the lowest/highest bucket in the incoming_data method. Not sure what impact that may have.
- There are 1025 entries in the bucket array, FYI. The bigadapt/smalladapt blocks use for loops that seem to be consistent with that length (one counts up and the other counts down), but be careful not to assume the length is 1024.
- I'm not sure the for loops in the bigadapt/smalladapt blocks count in the correct directions. The bigadapt block counts backwards until the percentage has been met, so a high percentage would trim off the bulk of the top values, returning a number on the low side of the range.
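As I understand the direction question, a backwards-counting scan covers the top x% rather than the bottom x%. The sketch below is a hypothetical top-down counterpart to the quoted loop, not the actual bigadapt code:

```c
#define NUM_BINS 1024

/* Walk the bins from the top, summing counts, until at least
   `numtrue` samples are covered. The returned bin index approximates
   the value above which numtrue samples lie (the top-x% threshold).
   The array has NUM_BINS + 1 entries (indices 0..1024), so the scan
   starts at index NUM_BINS. */
static int top_percentile_bin(const int buckets[NUM_BINS + 1], int numtrue)
{
    int i, sum;
    for (i = NUM_BINS, sum = 0; (i >= 0) && (sum < numtrue); i--)
        sum += buckets[i];
    return i;
}
```

Whether a given percentage parameter should feed the bottom-up or the top-down scan determines which side of the range the result lands on, which seems to be the crux of the concern above.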

ElliotMebane avatar May 19 '19 18:05 ElliotMebane