tobac
tobac copied to clipboard
Proposal interface for multiple methods in tobac v2.0
One of the biggest challenges with the v2.0 release of tobac is how to integrate multiple difference detection/tracking methods in a way that they can be mixed and matched freely without requiring major reworks of how those methods work. At present in the v2.0 branch we are one extreme with the "themes" framework which is essentially two libraries in one with little inter-compatibility. This allowed TINT to be quickly added to the tobac project. The other extreme would be to rework both the original tobac functions and TINT into a combined library, which would ensure inter-compatibility, but require a lot of dev work and make it difficult for existing users to switch workflows.
The following is a proposal for a third approach, using an interface layer that presents a common structure to the tobac user (i.e. what they see when they call "import tobac") without requiring major reworking of the various methods running in the background.
- Method categories
For the interface to work, each method must fit into one defined category. Although it would be nice to allow methods that, for example, perform both feature detection and segmentation, this would require a lot of extra complexity. For now, I think it's best to stick to single categories. Hopefully the only work required for adding new methods would be to split monolithic code blocks into separate sections for each category, which is probably a good think anyway.
The proposed categories are:
- Feature detection
- Segmentation
- Tracking
- Merging
- Splitting
For each of these categories we can then define a clear framework of what tobac expects each method to do, what parameters it receives and what it returns
Additional categories for ancillary methods, such as filtering, could also be added if it's useful
- Example interface
Here's an example of how such an interface would work for feature detection
2.1: Base class
Each category would need a base class, defined using abstract base class, to specify what tobac expects such a method to do"
from abc import ABC, abstractmethod
class Detection_Method(ABC):
"""
Framework for feature detection methods
"""
@abstractmethod
def set_field(self, field: xarray.DataArray) -> None:
pass
@abstractmethod
def get_field(self) -> xarray.DataArray:
pass
@abstractmethod
def detect_features(self, *args, **kwargs) -> xarray.Dataset:
pass
In this class we define a set of functions that tobac expects a category of methods to perform. They don't actually do anything, and use abstractmethod to raise a not implemented error if they are called, but specify a framework for inheritiance.
2.2 Detection method
For an actual detection method, we can create a class that inherits from the base class and actually defines the functions. Here's an example for multithreshold feature detection:
class Detection_Multithreshold(Detection_Method):
"""
Detect features using the tobac feature_detection_multithreshold method
"""
def set_field(self, field : xarray.DataArray) -> None:
self.field = xr_to_iris(field)
def get_field(self) -> xarray.DataArray:
return iris_to_xr(self.field)
def detect_features(self, *args, **kwargs) -> xarray.Dataset:
from tobac.themes.tobac_v1 import feature_detection_multithreshold
features_df = feature_detection_multithreshold(
self.field, *args, **kwargs)
return df_to_xr(features_df)
Within this we can also handle any type conversions required, so that we can maintain consistent data structures at the surface level of tobac without having to modify the actual functions. Note that because we inherit from the abstract base class, if we were lazy and didn't define any of the core functions then it would raise an exception rather than produce unexpected behaviour.
2.3 Method selection
Which method to use can be simply selected using a strategy method. See below for an example of tobac vs TINT:
def detection_strategy(method: str) -> Detection_Method:
"""
Select a feature detection method (probably could use enum instead)
"""
if method == "multithreshold":
return Detection_Multithreshold()
if method == "TINT":
return Detection_TINT()
...
There are better ways to implement this, but this is just a quick example
2.4 Feature detection
To bring it all together, this is the function that the user would actually call.
def feature_detection(
field: xarray.DataArray, method: str, *args, **kwargs
) -> xarray.Dataset:
"""
Detect features in a given dataarray using the specified method
"""
Detector = detection_strategy(method)
Detector.set_field(field)
features = Detector.detect_features(*args, **kwargs)
return features
Because all of the method inherit from a common base class, we don't have to worry about different methods requiring different function calls
2.5 Testing
We can also easily implement testing on this framework regardless of whether the underlying method is well or badly tested:
for method in detection_strategy:
Detector = detection_strategy(method)
test_set_field(Detector)
test_get_field(Detector)
test_detect_features(Detector)
This could, for example, isolate an error where a segmentation method was providing data in the wrong format to tracking, which might otherwise be difficult to debug
- Conclusion
An interface layer provides a relatively foolproof way of implementing multiple methods without the end-user knowing how they each individually work without requiring a lot of extra dev time. It also creates a clear framework for what does and doesn't fit within tobac, allowing us to be selective with what methods are added. It also allows us to stick the actual different codes out of the way in different themes, avoiding namespace conflicts, while still having all methods accessible when directly importing tobac.
If you have any feedback or suggestions I'd love to hear them, otherwise I'll go through this at the next dev meeting