# [PROPOSAL] Interface for derived (kinematic) variables

This issue is meant for discussing the way we compute, access and store variables derived from pose tracks in `movement`.

Apologies for the very long read, but this is an important design choice and warrants some deliberation. The conversation started during our last meeting with @lochhh and @sfmig. This is my attempt to write it up.
## Context: movement's dataset structure

Predicted poses (pose tracks) are represented in `movement` as an `xarray.Dataset` object - hereafter referred to as a movement dataset (`ds` in the example code snippets below).

Right after loading, each movement dataset contains two data variables stored as `xarray.DataArray` objects:
- `position`: with shape (`time`, `individuals`, `keypoints`, `space`)
- `confidence`: with shape (`time`, `individuals`, `keypoints`)
You can think of each data variable as a multi-dimensional `pandas.DataFrame` or as a `numpy.array` with labeled axes. In `xarray` terms, each axis (e.g. `time`) is called a dimension (`dim`), while the labeled 'ticks' along each axis are called coordinates (`coords`).
Grouping data variables together in a dataset makes sense when they share some common `dims`. In the movement dataset, the two variables share 3 out of 4 `dims` (see image above).

Other related data that do not constitute arrays but instead take the form of key-value pairs can be stored as attributes - i.e. inside the `attrs` dictionary.
All data variables and attributes can be conveniently accessed via the usual `.` attribute syntax, e.g.:

```python
position = ds.position      # equivalent to ds["position"]
confidence = ds.confidence  # equivalent to ds["confidence"]
fps = ds.fps                # equivalent to ds.attrs["fps"]
```
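To make the structure concrete, here is a minimal sketch of a toy dataset mirroring the layout described above (the dimension sizes and random values are made up for illustration; this is not how `movement` loads real data):

```python
import numpy as np
import xarray as xr

# Toy stand-in for a movement dataset (sizes are arbitrary)
rng = np.random.default_rng(seed=42)
n_time, n_individuals, n_keypoints = 100, 2, 3

position = xr.DataArray(
    rng.normal(size=(n_time, n_individuals, n_keypoints, 2)),
    dims=("time", "individuals", "keypoints", "space"),
    coords={"space": ["x", "y"]},
)
confidence = xr.DataArray(
    rng.uniform(size=(n_time, n_individuals, n_keypoints)),
    dims=("time", "individuals", "keypoints"),
)
ds = xr.Dataset(
    {"position": position, "confidence": confidence},
    attrs={"fps": 30},
)

print(ds.position.shape)  # (100, 2, 3, 2)
print(ds.fps)             # 30, same as ds.attrs["fps"]
```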
## Problem formulation

The `position` and `confidence` data variables (+ some attributes) are created automatically after loading predicted poses from one of our supported pose estimation frameworks.

The question is what to do with variables that `movement` derives from these 'primary' variables. For purposes of illustration, we will consider three example variables:
- `velocity`: an `xarray.DataArray` object with the same `dims` and `coords` as `position`.
- `velocity_pol`: velocity in polar coordinates. As of PR #155, this is a transformation of the above variable from cartesian to polar coordinates. It's also an `xarray.DataArray`, but its `space` dimension is replaced by `space_pol`, with `rho` (magnitude) and `phi` (angle) as coordinates.
- `speed`: the magnitude (euclidean norm) of `velocity`, and therefore equivalent to the `rho` in `velocity_pol`. This could be represented as an `xarray.DataArray` that lacks a spatial dimension altogether (similar to the `confidence` variable).
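The relationship between these three quantities can be sketched with plain numpy (the variable names `vx`, `vy`, `rho`, `phi` are illustrative, not part of movement's API):

```python
import numpy as np

# Hypothetical velocity components for one keypoint over three frames
vx = np.array([3.0, 0.0, -1.0])
vy = np.array([4.0, 2.0, 0.0])

rho = np.hypot(vx, vy)    # magnitude: this is exactly `speed`
phi = np.arctan2(vy, vx)  # angle, in radians

# speed computed directly from the cartesian velocity...
speed = np.linalg.norm(np.stack([vx, vy]), axis=0)
# ...is the same array of numbers as the rho component
print(np.allclose(speed, rho))  # True
```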
## Alternatives

Each of the above derived `xarray.DataArray` objects could be requested and stored in a variety of ways. Below, I'll go through some alternatives and attempt to supply pros/cons for each.

### 1. Status quo: derived variables as accessor properties

The status quo relies on extending xarray using accessors. In short, accessors are xarray's way of adding domain-specific functionality to its `xarray.DataArray` and `xarray.Dataset` objects. This is strongly preferred over the standard way of extending objects (inheritance).

Accordingly, we have implemented a `MoveAccessor`, which extends `xarray.Dataset` and is accessed via the keyword "move". For example:
```python
ds.cumsum()         # built-in xarray method
ds.sizes            # built-in xarray attribute
ds.move.validate()  # our movement-specific validation method
ds.move.velocity    # our movement-specific velocity property
```
Currently, derived variables can be computed via the accessor - `ds.move.velocity`. Under the hood, when we access the property for the first time, `velocity` is computed and stored as a data variable within the original dataset, alongside `position` and `confidence`. Once computed, it can be accessed in the same way as the 'primary' variables - i.e. `ds.velocity` or `ds["velocity"]`.

All currently implemented kinematic variables - `displacement`, `velocity`, and `acceleration` - behave in this way. Through PR #155, so do their polar transformations.
```python
velocity = ds.move.velocity
velocity_pol = ds.move.velocity_pol
speed = ds.move.speed
```
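For readers unfamiliar with the pattern, here is a minimal sketch of a compute-once, store-in-dataset accessor property (this is not movement's actual implementation; the accessor name `demo` and the use of `differentiate()` as a stand-in for the real kinematics code are assumptions made for illustration):

```python
import numpy as np
import xarray as xr

@xr.register_dataset_accessor("demo")
class DemoAccessor:
    """Toy accessor illustrating the caching-property pattern."""

    def __init__(self, ds):
        self._ds = ds

    @property
    def velocity(self):
        # compute once, then store alongside the primary variables
        if "velocity" not in self._ds:
            self._ds["velocity"] = self._ds["position"].differentiate("time")
        return self._ds["velocity"]

position = xr.DataArray(
    np.arange(10.0).reshape(5, 2),
    dims=("time", "space"),
    coords={"time": np.arange(5) / 30.0, "space": ["x", "y"]},
)
ds = xr.Dataset({"position": position})

_ = ds.demo.velocity     # first access computes and stores it
print("velocity" in ds)  # True: it now behaves like a primary variable
```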
#### Pros

- Derived variables are automatically "saved" in a sensible way within the dataset and continue being available for future usage. After they are first computed, they consistently behave like the 'primary' data variables.
- We potentially spare some computation. If a user requests `velocity` again, the stored variable will be returned.
- Ease of use: users don't need to know that speed is the magnitude of the velocity vector or how exactly it's computed. They ask for things and they get them (with all the necessary steps happening under the hood).
#### Cons

- Unfamiliar syntax. Users unacquainted with accessors (which is almost all users) may find the `.move` syntax strange and may not expect the automatic storage of variables.
- Data duplication. If we follow this strategy for all three of `velocity`, `velocity_pol` and `speed`, we will be storing the same data in many different ways. `velocity_pol` is just a simple transform of `velocity`, so it may not be worth storing within the dataset. The case is even more extreme for `speed`: if we store both `velocity_pol` and `speed`, we would be keeping the exact same array of numbers twice. Moreover, calling `ds.move.speed` would result in calling both `ds.move.velocity` and `ds.move.velocity_pol` under the hood, and users may be surprised by all the extra data variables they suddenly end up with.
### 2. Getting derived variables via accessor methods

This alternative still relies on the `MoveAccessor`, but gets to the derived variables via custom methods instead of custom properties. For example:

```python
velocity = ds.move.compute_velocity(coordinates="cartesian")
velocity_pol = ds.move.compute_velocity(coordinates="polar")
speed = ds.move.compute_speed()
# the above could be an alias for sth like
speed = ds.move.compute_velocity(coordinates="polar").sel(space_pol="rho").squeeze()
```
Each of these methods would return a separate `xarray.DataArray` object which would NOT be automatically stored in the original dataset. If the user wishes to store these in the dataset, they could do so explicitly:

```python
ds["velocity"] = ds.move.compute_velocity()
```
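A minimal sketch of this method-based variant might look as follows (again, not movement's actual code: the accessor name `demo2`, the `differentiate()` stand-in, and the polar construction are all assumptions for illustration):

```python
import numpy as np
import xarray as xr

@xr.register_dataset_accessor("demo2")
class Demo2Accessor:
    """Toy accessor illustrating the method-based pattern."""

    def __init__(self, ds):
        self._ds = ds

    def compute_velocity(self, coordinates="cartesian"):
        velocity = self._ds["position"].differentiate("time")
        if coordinates == "polar":
            rho = (velocity**2).sum(dim="space") ** 0.5
            phi = np.arctan2(
                velocity.sel(space="y", drop=True),
                velocity.sel(space="x", drop=True),
            )
            return xr.concat([rho, phi], dim="space_pol").assign_coords(
                space_pol=["rho", "phi"]
            )
        return velocity  # returned, NOT stored in self._ds

position = xr.DataArray(
    np.arange(10.0).reshape(5, 2),
    dims=("time", "space"),
    coords={"time": np.arange(5) / 30.0, "space": ["x", "y"]},
)
ds = xr.Dataset({"position": position})

velocity = ds.demo2.compute_velocity()
print("velocity" in ds)  # False: storing it is the user's choice
```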
#### Pros

- More explicit; fewer unexpected things happen under the hood.
- No automatic data duplication.
- Arguments can be passed to the methods (e.g. "cartesian" vs "polar" in the above example).
#### Cons

- Still relies on the unfamiliar `.move` syntax.
- May result in redundant/duplicate computations. For example, `ds.move.compute_speed()` would re-compute `velocity_pol` to get its magnitude, even if the user had previously computed `velocity_pol` (but hadn't stored it in the same dataset).
- Not as easy to use as alternative 1.
### 3. A mix of accessor properties and methods

From the above, it seems like using accessor properties duplicates data, while using accessor methods duplicates computation. Maybe it's possible to strike a balance between the two:

- Some 'privileged' variables may behave as in alternative 1, i.e. they will be accessed via properties and automatically stored in the dataset. Kinematic variables in cartesian coordinates (e.g. `velocity`, `acceleration`) would be good candidates for this.
- Other variables, especially those that can be trivially derived from the 'privileged' ones, will not be automatically stored and will be accessible via custom accessor methods.

This mixed approach could look something like this:

```python
velocity = ds.move.velocity
velocity_pol = velocity.move.cart2pol()
speed = velocity.move.magnitude()
```
This variant would require us to provide an extra accessor to extend `xarray.DataArray` objects and specifically operate on data variables that contain an appropriate spatial dimension (this is where the `cart2pol` and `magnitude` methods would be implemented).
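Such an extra `DataArray` accessor could be sketched along these lines (the accessor name `vec` is hypothetical, and the sketch assumes a `space` dim with `x`/`y` coordinates):

```python
import numpy as np
import xarray as xr

@xr.register_dataarray_accessor("vec")
class VectorAccessor:
    """Toy accessor for DataArrays with a cartesian 'space' dim."""

    def __init__(self, da):
        self._da = da

    def magnitude(self):
        # euclidean norm along the spatial dimension
        return (self._da**2).sum(dim="space") ** 0.5

    def cart2pol(self):
        rho = self.magnitude()
        phi = np.arctan2(
            self._da.sel(space="y", drop=True),
            self._da.sel(space="x", drop=True),
        )
        return xr.concat([rho, phi], dim="space_pol").assign_coords(
            space_pol=["rho", "phi"]
        )

velocity = xr.DataArray(
    [[3.0, 4.0], [0.0, 2.0]],
    dims=("time", "space"),
    coords={"space": ["x", "y"]},
)
print(velocity.vec.magnitude().values)                      # [5. 2.]
print(velocity.vec.cart2pol().sel(space_pol="rho").values)  # [5. 2.]
```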
#### Pros

- Balances the computation and storage needs.

#### Cons

- Still relies on the unfamiliar `.move` syntax.
- Some variables behaving one way and others behaving in another way is inconsistent and will probably confuse users.
- It requires us to implement an additional accessor for `xarray.DataArray` objects, potentially leading to further confusion.
- Requires some knowledge from the user (e.g. they should know that speed is the magnitude of velocity).
### 4. Use both accessor properties and methods

Another approach could be to always supply both alternatives 1 and 2 for every variable, so the user could choose between them:

```python
# This would automatically store the variables
# in the dataset, as in alternative 1
velocity = ds.move.velocity
velocity_pol = ds.move.velocity_pol

# This would NOT automatically store the variables
# in the dataset, as in alternative 2
velocity = ds.move.compute_velocity(coordinates="cartesian")
velocity_pol = ds.move.compute_velocity(coordinates="polar")
```
#### Pros

- Flexibility: the user can choose to prioritise computation or memory.

#### Cons

- Developer overhead: we have to implement and test both ways of doing things.
- We now have two ways of doing things, both of which rely on the unfamiliar `.move` syntax.
- The trade-offs between the two methods may not be readily apparent, and users may be unsure which one to use, leading to situations where we compromise both computation and memory.
### 5. Forget about accessors

We can always abandon the accessor way of doing things and (given that inheritance and composition are discouraged for `xarray` objects) forget about object-oriented programming (OOP) altogether. We could instead rely on analysis and utility functions that take one `xarray.DataArray`, apply some operation to it, and return another `xarray.DataArray`, e.g.:

```python
from movement.analysis import kinematics as kin
from movement.utils.vector import cart2pol, magnitude

velocity = kin.compute_velocity(ds["position"])
velocity_pol = cart2pol(velocity)
speed = magnitude(velocity)
```

The above is already possible, by the way (apart from the `magnitude()` function, which could easily be added).
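For completeness, here is one way the missing `magnitude()` helper could look - a sketch, not a proposed final implementation, assuming the input has a `space` dimension:

```python
import xarray as xr

def magnitude(da: xr.DataArray, dim: str = "space") -> xr.DataArray:
    """Euclidean norm of a vector-valued DataArray along `dim`."""
    return (da**2).sum(dim=dim) ** 0.5

velocity = xr.DataArray(
    [[3.0, 4.0]], dims=("time", "space"), coords={"space": ["x", "y"]}
)
print(magnitude(velocity).values)  # [5.]
```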
#### Pros

- The most explicit of all options (no hidden magic).
- No fussing with accessors and OOP; even Python beginners could understand the syntax.
- Easy to implement and test.

#### Cons

- Not very convenient to use. Users need to know exactly which modules to import and which functions to call.
- More verbose.
## My personal take

After considering these alternatives, I lean towards sticking with the status quo (alternative 1) - i.e. every derived variable is an accessor property, and they all get stored as data variables in the dataset, duplication be damned. This means that users will have to get used to the slightly strange `.move` syntax and behaviour, but at least these will be consistent throughout and there will be one main syntax to learn.
Power users who wish to override the default 'magic' behaviour can do so by using alternative 5, which already works anyway (and is what actually happens under the hood).
That said, I'm open to counter-arguments, and there may well be alternatives I haven't considered, so please chime in @neuroinformatics-unit/behaviour @b-peri !