# Self-contain `sleap.nn` code

## Goal
The goal of the PRs that solve this issue is to self-contain the `sleap.nn` code and essentially create a "sleap-nn" placeholder. Unlike sleap-nn, which has turned into a long-running development project, the PRs that solve this issue will just rearrange existing code so that all the `sleap.nn` code is self-contained and can be moved to its own package. This will also support 🤞🏼 a seamless replacement with the actual sleap-nn package.
## Caveat
However, there are a few places where `sleap.gui` imports from the `sleap.nn` code. Here we keep track of those places and plan resolutions (which will also affect the main sleap repo). Ideally, `sleap.nn` code is ONLY used (and a sleap-nn installation is only needed) IF the user wants to run training through the GUI.
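For example, a minimal sketch of the guarded-import pattern this implies; the function name and error message here are illustrative, not the actual sleap API:

```python
def launch_training_from_gui(config_path: str):
    """Only require the training backend when the user actually starts training."""
    try:
        import sleap_nn  # hypothetical backend import; only needed for training
    except ImportError as exc:
        raise ImportError(
            "Training from the GUI requires the sleap-nn package to be installed."
        ) from exc
    # ... build the training job from config_path and launch the run ...
```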
## Suggestion
We find that we often access the model and training job config classes through the GUI, which is expected, since we want to be able to write training configurations and load model/training configurations from the GUI. Perhaps the best place for these classes is in the middle-man sleap-io library?
## Places where `sleap.nn` code is used (outside of `sleap.nn`)

### `sleap/__init__.py`

https://github.com/talmolab/sleap/blob/ad7c563bc2bbbf8acee1dd5ac472a6b3ae116a52/sleap/__init__.py#L14-L21
### `sleap/instance.py`

- `LabeledFrame.plot` uses methods from `sleap.nn.viz`
- `LabeledFrame.plot_predicted` uses methods from `sleap.nn.viz` - #2168
### `sleap/gui/learning/config.py`

- `ConfigFileInfo.from_config_file` uses `sleap.nn.config.TrainingJobConfig`
- `TrainingJobConfigsGetter.try_loading_path` uses `sleap.nn.config.TrainingJobConfig`
### `sleap/gui/learning/datagen.py`

- `show_datagen_preview` uses `sleap.nn.data.providers.LabelsReader`
- `make_datagen_results` uses methods from `sleap.nn.data`
### `sleap/gui/learning/receptivefield.py`

- `receptive_field_info_from_model_cfg` uses `sleap.nn.model.ModelConfig` and `sleap.nn.model.Model`
- `ReceptiveFieldWidget.setModelConfig` uses `sleap.nn.model.ModelConfig`
### `sleap/gui/learning/runners.py`

- `write_pipeline_files` uses `sleap.nn.training.setup_new_run_folder`
- `run_gui_training` uses `sleap.nn.training.setup_new_run_folder`
- `train_subprocess` uses `sleap.nn.config.TrainingJobConfig`
### `sleap/gui/learning/scopedkeydict.py`

- `make_training_config_from_key_val_dict` uses `sleap.nn.config.TrainingJobConfig`
- `make_model_config_from_key_val_dict` uses `sleap.nn.config.ModelConfig`
### `sleap/gui/learning/base.py`

- `ModelData.__getitem__` uses `sleap.nn.data.providers.VideoReader`
- `DataOverlay.make_predictor` uses `sleap.nn.inference.VisualPredictor` - #2167
### `sleap/gui/widgets/monitor.py`

- `LossViewer.reset` uses `sleap.nn.config.training_job.TrainingJobConfig` - #2162
### `sleap/info/trackcleaner.py`

- `fit_tracks` uses `sleap.nn.tracking.TrackCleaner` - #2161
### `sleap/io/dataset.py`

- `Labels.to_pipeline` uses `sleap.nn.data.pipelines` - #2160
### `sleap/io/video.py`

- `Video.to_pipeline` uses `sleap.nn.data.pipelines` - #2160
I think the tough part of the refactor here will be the config.
The config (`TrainingJobConfig`) serves two purposes:

1. Specify the hyperparameters for training.
2. Document metadata about the model.
(1) is specific to the backend implementation. The sleap-nn config will be a bit different and won't have exactly the same fields. The problem is that the GUI is currently tightly coupled to this. We could remap the fields, but the logic in the config getter and builder is a nightmare.
The training config editor is overdue for a refresh and we have to gut it no matter what. I propose we have two UIs: a simple one with the most common presets and parameters, and another that is basically a field-value table listing the full set of config items. The first requires explicit mappings, but with fewer fields that is manageable. The second could be auto-populated from a schema, so no change is required for different backends, and it automatically exposes all the settings for advanced users.
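As a rough illustration of the schema-driven table idea, here is a sketch that flattens a nested dataclass-style config into (key, value) rows; `ExampleTrainingConfig` is a stand-in, not the real `TrainingJobConfig`:

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class ExampleOptimizerConfig:
    learning_rate: float = 1e-4
    batch_size: int = 4


@dataclass
class ExampleTrainingConfig:
    run_name: str = ""
    epochs: int = 100
    optimizer: ExampleOptimizerConfig = field(default_factory=ExampleOptimizerConfig)


def flatten_config(cfg, prefix=""):
    """Walk a nested dataclass config and yield (dotted key, value) rows
    suitable for a field-value table widget."""
    for f in fields(cfg):
        value = getattr(cfg, f.name)
        key = f"{prefix}{f.name}"
        if is_dataclass(value):
            yield from flatten_config(value, prefix=key + ".")
        else:
            yield key, value


# list(flatten_config(ExampleTrainingConfig())) ->
# [("run_name", ""), ("epochs", 100),
#  ("optimizer.learning_rate", 0.0001), ("optimizer.batch_size", 4)]
```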
(2) could be made a bit more abstract from the backend and is probably the right pattern, but it will take some work to implement. We need to list out the model properties that the GUI actually needs (e.g., model type, number of labeled frames, etc.) in a dataclass, then have something that pulls out or infers these values from the config.
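Something along these lines, where the field list and the adapter hook are assumptions about what the GUI needs rather than a final design:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelInfo:
    """Backend-agnostic model metadata for display in the GUI (illustrative fields)."""

    model_type: str  # e.g., "single_instance", "centroid", "centered_instance"
    skeleton_name: Optional[str] = None
    n_labeled_frames: Optional[int] = None
    input_scale: float = 1.0
    max_stride: Optional[int] = None  # used for the receptive field display

    @classmethod
    def from_training_config(cls, cfg) -> "ModelInfo":
        # Each backend provides an adapter that pulls out or infers these values
        # from its own config class; this stub only shows the interface shape.
        raise NotImplementedError
```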
The tricky decoupling will need to deal with config fields that are computed on the fly downstream. For example, some config fields are inferred from the data, which requires iterating over the dataset, so we defer those to model building in the backend. Some of this logic lives in the network factories themselves (e.g., the UNet class), so it is hard not to duplicate it if we can't import `sleap.nn`, unless we define a high-level set of model metadata.
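One way to keep that duplication out of the GUI is to leave data-dependent fields unset in the config and resolve them in the backend at model-build time; a sketch with made-up field and accessor names:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleHeadConfig:
    n_nodes: Optional[int] = None      # inferred from the skeleton
    in_channels: Optional[int] = None  # inferred from the first image


def resolve_data_dependent_fields(head_cfg: ExampleHeadConfig, labels) -> ExampleHeadConfig:
    """Fill in fields that require iterating over the dataset, so the GUI can
    write a config without importing the backend."""
    if head_cfg.n_nodes is None:
        head_cfg.n_nodes = len(labels.skeletons[0].nodes)
    if head_cfg.in_channels is None:
        first_image = labels[0].image  # assumed accessor for the first labeled frame's image
        head_cfg.in_channels = first_image.shape[-1]
    return head_cfg
```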