predict.py node / probability accumulation: usage clarifications
Hello,
As previously mentioned in the timeflux main repo issues, I'm trying to use the nodes exemplified in the c-VEP speller. Beyond the classifier itself, I'm struggling a bit to understand how to use the different arguments properly, and perhaps to understand the inner workings of the node (speller/CVEP/speller/nodes/predict.py).
Some particular questions:
-
Digging into this node's inner workings, I believe I must change the "trigger" of the "epoch" node in my pipelines, so as to have one epoch per cycle instead of one per trial (of several cycles), as I had until now.
-
Another confusing aspect for me is the buffer-related arguments of three nodes: epoch (param: buffer), classification (param: buffer_size), and predict (params: min_buffer_size and max_buffer_size). I understand that the first two buffer signal that would normally be unnecessary, in case the transmission of events is delayed, whereas the predict buffers relate to the minimal/sufficient number of repetitions before broadcasting a probability/classification. However, the practical consequences of these two arguments are not yet straightforward to me.
-
Finally, the necessity of the "reset events" (on the i_reset port), and how to implement them, are not clear to me either. I believe I'm supposed to emit an event on this port at the end of a trial, but the format these reset events must take is not crystal clear. I can see that it is possibly a dataframe with a "label" column containing "reset_{source}_accumulation". One ambiguity for me: it could be a reset of the accumulated 'calibration' data, or a reset of all the accumulated cycles once each trial is over during the test phase (predict_proba).
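For concreteness, here is the kind of reset event I imagine emitting at the end of a trial. This is only my guess from reading the node: I'm assuming the usual timeflux convention of events as a DataFrame indexed by timestamp with "label"/"data" columns, and the "reset_{source}_accumulation" label template with {source} replaced by the relevant source name.

```python
import pandas as pd

def make_reset_event(source="predict"):
    """Build a hypothetical reset event for the i_reset port.

    The "predict" source name is a placeholder; predict.py should be
    checked for the actual name it expects in the label.
    """
    return pd.DataFrame(
        {"label": [f"reset_{source}_accumulation"], "data": [None]},
        index=[pd.Timestamp.now(tz="UTC")],
    )

event = make_reset_event()
```

Is something along these lines what the node expects?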
I hope the uncertainties I raised are understandable as I explained them, and I am looking forward to clarifications on these various interesting nodes, which are core to the advantages timeflux offers for live time-series ML pipelines.
Best,
Julien
Hi,
Apologies for answering only now.
The predict node accumulates probabilities from the classification engine and emits a final prediction when enough confidence is reached.
-
I haven't touched the code in a while, but I believe events are sent for each cycle in the current interface. I'm not sure, though, what you're trying to achieve.
-
You're right in your interpretation of the buffers for the first two nodes. They allow capturing data even if the event triggers are delayed. Regarding `min_buffer_size` and `max_buffer_size` in `predict`, the behavior is a bit different. This is a circular buffer that allows you to fine-tune the decision model. `min_buffer_size` is the minimum number of predictions to accumulate from the classification engine before even attempting to emit a decision. For example, you may want to wait a little before taking a decision, in case the subject's gaze has been captured by an adjacent stim. `max_buffer_size`, on the other hand, allows discarding past predictions, in case the subject was distracted in previous trials; otherwise it would take too long to emit a decision, knowing that early predictions might not be relevant. There is a balance to be found, depending on your EEG headset, the subject, environmental conditions, etc.
-
You can play with and adjust these parameters from the web interface. The `s` key will bring up a configuration menu that will send a reset event to the prediction engine. It is not strictly required for the classification and prediction engines to work; it just allows you to change parameters on the fly. See: https://github.com/timeflux/demos/blob/225837acd7aac6319bca2f2c60e7a4bf1817381b/speller/CVEP/speller/gui/assets/js/app.js#L58-L76
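To make the circular-buffer behavior concrete, here is a simplified sketch of the accumulation logic. It is not the actual predict.py code: only the `min_buffer_size` and `max_buffer_size` names come from the node, while the probability averaging and the `threshold` parameter are assumptions for illustration.

```python
from collections import deque

import numpy as np

class ProbabilityAccumulator:
    """Toy model of the predict node's circular accumulation buffer."""

    def __init__(self, min_buffer_size=3, max_buffer_size=10, threshold=0.9):
        self.min_buffer_size = min_buffer_size
        # A deque with maxlen acts as the circular buffer: the oldest
        # cycle predictions are silently dropped once max_buffer_size
        # is reached, so stale evidence cannot block a decision forever.
        self.buffer = deque(maxlen=max_buffer_size)
        self.threshold = threshold

    def add_cycle(self, probas):
        """Add one cycle's class probabilities; return a decision or None."""
        self.buffer.append(np.asarray(probas))
        if len(self.buffer) < self.min_buffer_size:
            return None  # not enough evidence accumulated yet
        # Average the probabilities over the buffered cycles and decide
        # only if the best class is confident enough.
        mean = np.mean(self.buffer, axis=0)
        if mean.max() >= self.threshold:
            return int(mean.argmax())
        return None

    def reset(self):
        """Clear accumulated cycles, e.g. at the end of a trial."""
        self.buffer.clear()
```

With `min_buffer_size=2`, the first cycle never triggers a decision; from the second cycle on, a decision is emitted as soon as the averaged probabilities are confident enough.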