Task Slots & Lifecycle Methods
The `Task` class is hopelessly barebones at the moment and could use some hooks. Starting to sketch what it should have, given experience with what people have wanted. Very tired at the moment, so this is just a stub.
Lifecycle Methods:

- `input_before_run`: if running the task in a terminal or other interactive mode, get parameters like weight (or not) or any other parameters that might change every time the task is run
- `before_run`: standardize the `init_hardware` et al. that are pretty badly hardcoded for the 2AFC class into slots. Would need to think about how this interacts with current task parameterization in class attrs
- `post_run`: release resources, compute on collected data, send it somewhere, etc.
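To make the lifecycle hooks concrete, here's a minimal sketch of how they could slot into a `Task` base class. The hook names follow the list above; everything else (the `run()`/`main()` structure, attribute names) is hypothetical, not current autopilot API:

```python
# Hypothetical sketch of lifecycle hooks on a Task base class.
# Only the hook names come from the proposal above; the run() scaffold
# is invented here for illustration.
class Task:
    def input_before_run(self):
        """Gather per-session parameters (e.g. weight) interactively."""
        return {}

    def before_run(self):
        """Standardized setup slot, e.g. hardware initialization."""
        pass

    def post_run(self):
        """Release resources, compute on collected data, send it somewhere."""
        pass

    def run(self):
        # hooks bracket the task's actual stage logic
        params = self.input_before_run()
        self.before_run()
        try:
            self.main(params)  # hypothetical name for the stage loop
        finally:
            self.post_run()    # always release resources, even on error

    def main(self, params):
        pass
```

A subclass would then override only the hooks it needs, and the base class guarantees the ordering and cleanup.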
Slots/params:

- data target: where to send data? Enable local running of tasks, i.e. being able to make a subject file locally (instead of the weird double implementation of the local data file)
- triggers!!! this should be much more formalized so that we can use a thread pool and manage concurrency, rather than the very freeform trigger system we have right now
- inputs!!! properties that can be targeted from other tasks/hardware components to affect the task!
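As a rough sketch of what a more formalized trigger system backed by a thread pool could look like: the `TriggerManager` name and its methods are invented here for illustration, and the explicit `advance` flag anticipates making stage advancement explicit rather than implicit.

```python
# Hypothetical sketch of a trigger registry backed by a thread pool,
# replacing the freeform trigger dict. All names are illustrative.
from concurrent.futures import ThreadPoolExecutor
from collections import defaultdict

class TriggerManager:
    def __init__(self, max_workers=4):
        self.pool = ThreadPoolExecutor(max_workers=max_workers)
        self.triggers = defaultdict(list)

    def register(self, event, callback, advance=False):
        # 'advance' makes stage advancement an explicit property of the
        # trigger rather than an implicit side effect of handling it
        self.triggers[event].append((callback, advance))

    def handle(self, event, *args):
        """Dispatch all callbacks for an event on the pool; return whether
        any registered trigger requests a stage advance."""
        futures = [self.pool.submit(cb, *args)
                   for cb, _ in self.triggers[event]]
        for f in futures:
            f.result()  # propagate exceptions raised inside callbacks
        return any(advance for _, advance in self.triggers[event])
```

This keeps "log-only" triggers (`advance=False`) and stage-advancing triggers in one registry while the pool manages concurrency.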
I'm pretty psyched about all of these proposed changes! I would add a few random thoughts:
- Some of the logic in `Task` seems to kind of assume a 2AFC structure. Most notably, I think pokes ALWAYS trigger a state change (which makes sense in a classic 3-port box, but not, for instance, in my task, where I just want to log certain poke events without doing anything about them). In my task I had to overload `handle_trigger` to change this behavior and it is pretty messy.
- I think at some point I was changing the `TrialData` that is returned to the Terminal, which I think requires redefining the HDF5 file, which I think is not easy (or at least not clean) to do without also regenerating the HDF5 files for each subject. What I ended up doing is just dumping all the data that I want to return to the logfiles instead, and writing custom parsers for those logfiles, but that is not so clean. I can't really imagine a fix to this issue that doesn't totally reimagine (and break backwards compatibility with) the way data is logged, so this is not a high-priority feature to implement.
For instance, before I used autopilot I used something I custom-wrote for myself, which would basically generate a JSON file of task parameters for each session. These parameters originated from the mouse name and the box name (e.g. water durations) so mice could be moved from box to box no problem (as I understand it, this is not possible right now with Autopilot, have to create a new subject to move a mouse to a new box). The task code itself would read these JSON files and use them to run the task. It would dump out all of the results in plain text files. Then later, the results are loaded from these plain text files into more structured data tables, so it's possible to reimagine the way they are loaded without changing the way the raw data is stored. These thoughts are kind of bumping around in my head to think about how we could implement something like this in Autopilot at some point. Would be a substantial change though so not something to enter into lightly
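The per-session JSON workflow described above can be sketched roughly like this; the file layouts and key names here are hypothetical, just to make the subject-plus-box merge concrete (this is not autopilot code):

```python
# Sketch of the workflow described above: subject-level and box-level
# parameters are merged into one JSON blob per session, so a mouse can
# move between boxes. All key names here are invented for illustration.
import json

def make_session_params(subject_params, box_params):
    """Box-level values (e.g. water durations) extend/override
    subject-level values for this session."""
    params = dict(subject_params)
    params.update(box_params)
    return params

subject = {'name': 'mouse_01', 'stage': 3}
box = {'box': 'box_A', 'water_duration_ms': 40}
session = make_session_params(subject, box)
# the task would then read this JSON at startup
session_json = json.dumps(session)
```

The point is just that the session file is derived from two independent sources, so neither the subject nor the box record has to change when a mouse moves.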
:+1:
> Some of the logic in `Task` seems to kind of assume a 2AFC structure.
totally agree, I hadn't looked at it in a while because I have been working on hardware and architectural things, but yeah, it's pretty bad. Will be reworking it as soon as I can as part of https://github.com/wehr-lab/autopilot/issues/31
> Most notably, I think pokes ALWAYS trigger a state change (which makes sense in a classic 3-port box, but not for instance in my task, where I just want to log certain poke events without doing anything about it). In my task I had to overload `handle_trigger` to change this behavior and it is pretty messy.
aha, you mean like there is some trigger but it's just to log the event, and you don't want to clear the flag to advance the stage. Definitely need to at least document this behavior, but yes, part of a `Trigger` object should definitely be more explicit behavior about stage advancement à la FSMs (which is sort of implicit in the stage logic, i.e. you could define a generator function that yields the same stage, but that's very far from obvious).
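For what it's worth, the "generator that yields the same stage" idea looks something like this; the task attributes and stage names are invented for illustration, not autopilot API:

```python
# Illustrating the implicit FSM in stage logic: a stage generator can
# yield the same stage repeatedly until some condition clears, making
# "don't advance" explicit. Names are illustrative only.
def stages(task):
    yield task.wait_for_poke
    while not task.poke_was_correct:
        yield task.wait_for_poke  # stay in the same stage
    yield task.reward
```

Each call to `next()` hands back the stage to run; yielding the same stage again is how you express "this event doesn't advance anything."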
> I think at some point I was changing the `TrialData` that is returned to the Terminal, which I think requires redefining the HDF5 file, which I think is not easy (or at least not clean) to do without also regenerating the HDF5 files for each subject. What I ended up doing is just dumping all the data that I want to return to the logfiles instead, and writing custom parsers for those logfiles, but that is not so clean. I can't really imagine a fix to this issue that doesn't totally reimagine (and break backwards compatibility) for the way data is logged, so this is not a high-priority feature to implement.
I don't think I follow; you mean like you wanted to update the task/make a new version of it that had different fields returned? It should be the case that when you reassign a task it stashes the prior data and recreates all the tables ( https://github.com/wehr-lab/autopilot/blob/e408c08e76df8d8c930edfb4ba58e31cd1ec4d87/autopilot/core/subject.py#L394 ). The idea is to document all those changes, but if it's not doing that then that's definitely a bug and I would welcome an issue.
> For instance, before I used autopilot I used something I custom-wrote for myself, which would basically generate a JSON file of task parameters for each session. These parameters originated from the mouse name and the box name (e.g. water durations) so mice could be moved from box to box no problem (as I understand it, this is not possible right now with Autopilot, have to create a new subject to move a mouse to a new box).
There's nothing really intrinsic about the box in the protocol or subject file. I haven't enabled a drag-and-drop type interface to switch them, but if you just add the subject id to the list of subjects in the `pilot_db.json` on the terminal there shouldn't be a problem.
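Mechanically, that edit is just adding an id to a list in the JSON file. The sketch below invents a plausible layout for `pilot_db.json` purely for illustration; the real schema may differ, so treat this as the shape of the operation rather than a recipe:

```python
# Hypothetical sketch of adding an existing subject to another pilot's
# subject list in pilot_db.json. The file layout here is a guess for
# illustration, NOT the documented schema.
import json

def add_subject(db, pilot, subject_id):
    """Append subject_id to the pilot's subject list, creating it if needed."""
    subjects = db.setdefault(pilot, {}).setdefault('subjects', [])
    if subject_id not in subjects:
        subjects.append(subject_id)
    return db

# usage: load, modify, and rewrite the terminal's pilot_db.json, e.g.
#   with open('pilot_db.json') as f:
#       db = json.load(f)
db = {'pilot_1': {'subjects': ['mouse_01']}}
add_subject(db, 'pilot_2', 'mouse_01')
```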
> The task code itself would read these JSON files and use them to run the task. It would dump out all of the results in plain text files. Then later, the results are loaded from these plain text files into more structured data tables, so it's possible to reimagine the way they are loaded without changing the way the raw data is stored. These thoughts are kind of bumping around in my head to think about how we could implement something like this in Autopilot at some point. Would be a substantial change though so not something to enter into lightly
I'd love to hear more of your thoughts on this; I think I might be missing something, so would love to see an example of what you're seeing. The data storage and declaration system is pretty brittle now and definitely needs to be reworked (specifically, I want to unify the notion of parameters throughout the system, which are currently implemented differently in (at least) `scripts`, `task.TrialData`, `task.PARAMS`, and `requires`, so that it's possible to swap around data specifications and output formats and the like from a composable param spec that has dtypes et al.). But yes, the idea is that you just directly save things into the format you want them in at the time of acquisition, and so supporting changes to the data that's returned etc. is absolutely the plan. Continuous data just makes a new column when it gets a new kind of key, and there's no real reason that couldn't be done with trial data either.
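A "composable param spec with dtypes" could be as simple as the sketch below. None of this is current autopilot API; the class and field names are invented to show how one spec could back `TrialData`, `PARAMS`, `requires`, etc. alike:

```python
# Hypothetical sketch of a composable parameter specification with dtypes,
# as proposed above. All names are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Param:
    name: str
    dtype: type
    default: object = None

@dataclass
class ParamSpec:
    params: list = field(default_factory=list)

    def __add__(self, other):
        # composition: combining two specs yields one unified spec
        return ParamSpec(self.params + other.params)

    def validate(self, values):
        """Coerce supplied values to declared dtypes, filling defaults."""
        out = {}
        for p in self.params:
            v = values.get(p.name, p.default)
            out[p.name] = p.dtype(v)
        return out

trial = ParamSpec([Param('target', str, 'L'), Param('correct', bool, False)])
timing = ParamSpec([Param('reward_ms', int, 20)])
spec = trial + timing  # one spec built from reusable pieces
```

Because the spec carries dtypes, the same object could drive GUI parameter entry, HDF5 table declaration, and validation of incoming data.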
Oh, forgot to mention: if you just want to log events you can do it in a few ways.

- calling `Digital_In` with `record=True` will give you a deque of `(event, timestamp)` tuples in the `.events` attribute
- you can add a callback with the sort of unfortunately named `assign_cb` method that sends each event back to the terminal for storage as continuous data with a `Net_Node` object, sending a message like `node.send('T', 'DATA', {'continuous':True, 'event':'timestamp', 'subject':'subject_id'})` back to the terminal, which should store it in the continuous data table for that session. Continuous data is more permissive (it should try and make a column and table for it even if it doesn't know about it beforehand with a `ContinousData` descriptor, but I haven't used that for a while and don't have a test for it, unfortunately)
- as part of a stage method you could collect all the `Digital_In.events` and batch-send them, if it's important to group them in a trial structure.
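The third option (batch-sending from a stage method) would look roughly like this. The function name and message fields beyond those shown above are invented; only the drain-and-send pattern is the point:

```python
# Sketch of draining Digital_In.events at the end of a trial and
# batch-sending them as one message. The helper name and extra message
# fields are hypothetical; the .events deque of (event, timestamp)
# tuples is as described above.
def end_of_trial(poke, node, trial_num, subject):
    events = list(poke.events)  # snapshot the deque
    poke.events.clear()         # so the next trial starts empty
    node.send('T', 'DATA', {
        'continuous': True,
        'trial_num': trial_num,   # groups the events in a trial structure
        'subject': subject,
        'events': events,
    })
    return events
```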
> I don't think I follow, you mean like you wanted to update the task/make a new version of it that had different fields returned? It should be the case that when you reassign a task it stashes the prior data and recreates all the tables (…)
Oh I see... well, I think I probably wasn't doing something right then. What I remember doing is that I added a field of data to be returned in `TrialData`, but it wasn't being stored in the HDF5 file for the subject. If I created a new subject, then it created a new HDF5 file which included the new field. I concluded from that experience that it wasn't capable of changing `TrialData` on the fly, but now it sounds like I was doing something wrong. I'll see if I can recapitulate and then file an issue.
oh yes, you're right, that's undocumented and unclear, and therefore a bug! The subject class detects changes in the protocol.json file and updates, but not changes in the task itself. There's no reason to ever drop data, I don't think, and the logic to add a column on the fly is straightforward, so doing so with a warning is, I think, the fix. It's a tradeoff between clear specification and avoiding frustrating workarounds, and I almost always prefer the latter. It should probably log in the subject's history table as well.
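The "add a column with a warning" fix amounts to rebuilding the table over the union of old and new columns. This pure-Python sketch just models that logic on rows-as-dicts (the real tables are PyTables/HDF5, and the function name here is invented):

```python
# Pure-Python model of the proposed fix: when TrialData gains a new field,
# keep all existing rows, pad them with a fill value for the new columns,
# and warn instead of silently dropping data. The real implementation
# would do this against PyTables tables; this only models the logic.
import warnings

def update_columns(rows, old_cols, new_cols, fill=None):
    """Return rows re-shaped to new_cols, warning about added columns."""
    added = [c for c in new_cols if c not in old_cols]
    if added:
        warnings.warn(
            f"TrialData gained new columns {added}; adding them with a "
            "fill value rather than dropping existing data")
    return [{c: row.get(c, fill) for c in new_cols} for row in rows]
```

Logging the same message to the subject's history table would make the change auditable later.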