
[PROPOSAL] Interface for derived (kinematic) variables

Open niksirbi opened this issue 2 months ago • 8 comments

This issue is meant for discussing the way we compute, access and store variables derived from pose tracks in movement. Apologies for the very long read, but this is an important design choice and warrants some deliberation. The conversation started during our last meeting with @lochhh and @sfmig. This is my attempt to write it up.

Context: movement's dataset structure

Predicted poses (pose tracks) are represented in movement as an xarray.Dataset object - hereafter referred to as a movement dataset (ds in the example code snippets below).

Right after loading, each movement dataset contains two data variables stored as xarray.DataArray objects:

  • position: with shape (time, individuals, keypoints, space)
  • confidence: with shape (time, individuals, keypoints)

[Figure: diagram of the movement dataset structure, showing the position and confidence data variables and their shared dimensions]

You can think of each data variable as a multi-dimensional pandas.DataFrame or as a numpy.array with labeled axes. In xarray terms, each axis (e.g. time) is called a dimension (dim), while the labeled 'ticks' along each axis are called coordinates (coords).

Grouping data variables together in a dataset makes sense when they share some common dims. In the movement dataset the two variables share 3 out of 4 dims (see image above).

Other related data that do not constitute arrays but instead take the form of key-value pairs can be stored as attributes - i.e. inside the attrs dictionary.

All data variables and attributes can be conveniently accessed via the usual . attribute syntax, e.g.:

position = ds.position   # equivalent to ds["position"]
confidence = ds.confidence   # equivalent to ds["confidence"]
fps = ds.fps  # equivalent to ds.attrs["fps"]
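To make this structure concrete, here is a minimal stand-in for a movement dataset built from random data. The sizes, individual names, and keypoint names are made up for illustration; they are not movement's defaults.

```python
import numpy as np
import xarray as xr

# Hypothetical sizes: 100 frames, 2 individuals, 3 keypoints, 2D space
rng = np.random.default_rng(0)
position = xr.DataArray(
    rng.random((100, 2, 3, 2)),
    dims=("time", "individuals", "keypoints", "space"),
    coords={
        "individuals": ["ind1", "ind2"],
        "keypoints": ["snout", "centre", "tail_base"],
        "space": ["x", "y"],
    },
)
confidence = xr.DataArray(
    rng.random((100, 2, 3)),
    dims=("time", "individuals", "keypoints"),
)
# Scalar metadata such as fps lives in attrs, not as a data variable
ds = xr.Dataset(
    {"position": position, "confidence": confidence},
    attrs={"fps": 30},
)
print(ds.position.shape)  # (100, 2, 3, 2)
print(ds.fps)             # 30
```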

Problem formulation

The position and confidence data variables (+ some attributes) are created automatically after loading predicted poses from one of our supported pose estimation frameworks.

The question is what to do with variables that movement derives from these 'primary' variables. For purposes of illustration we will consider three example variables:

  • velocity: which is an xarray.DataArray object with the same dims and coords as position.
  • velocity_pol: velocity in polar coordinates. As of PR #155, this is a transformation of the above variable from cartesian to polar coordinates. It's also an xarray.DataArray, but its space dimension is replaced by space_pol, with rho (magnitude) and phi (angle) as coordinates.
  • speed: this is the magnitude (Euclidean norm) of velocity and is therefore equivalent to rho in velocity_pol. It could be represented as an xarray.DataArray that lacks a spatial dimension altogether (similar to the confidence variable).
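The relationship between these three variables can be checked with plain NumPy. The toy vectors below carry only (time, space) for brevity; the real arrays also have individuals and keypoints dims.

```python
import numpy as np

# Toy velocity vectors with space = (x, y)
velocity = np.array([[3.0, 4.0], [0.0, 2.0]])

# Cartesian -> polar: rho (magnitude) and phi (angle)
rho = np.linalg.norm(velocity, axis=-1)
phi = np.arctan2(velocity[..., 1], velocity[..., 0])
velocity_pol = np.stack([rho, phi], axis=-1)

# speed is the Euclidean norm of velocity...
speed = np.linalg.norm(velocity, axis=-1)

# ...and is therefore exactly rho, the radial part of velocity_pol
assert np.allclose(speed, velocity_pol[..., 0])
print(speed)  # [5. 2.]
```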

Alternatives

Each of the above derived xarray.DataArray objects could be requested and stored in a variety of ways. Below, I'll go through some alternatives, and attempt to supply pros/cons for each:

1. Status quo: derived variables as accessor properties

The status quo relies on extending xarray using accessors. In short, accessors are xarray's way of adding domain-specific functionality to its xarray.DataArray and xarray.Dataset objects. This is strongly preferred over the standard way of extending objects (inheritance).

Accordingly, we have implemented a MoveAccessor, which extends xarray.Dataset and is accessed via the keyword "move". For example:

ds.cumsum()  # built-in xarray method
ds.sizes  # built-in xarray attribute

ds.move.validate()  # our movement-specific validation method
ds.move.velocity  # our movement-specific velocity property

Currently, derived variables can be computed via the accessor - ds.move.velocity. Under the hood, when we access the property for the first time, velocity is computed and stored as a data variable within the original dataset, alongside position and confidence. Once computed, it can be accessed in the same way as the 'primary' variables - i.e. ds.velocity or ds["velocity"].
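A stripped-down sketch of how such a cached accessor property can work is shown below. The accessor name kin and the use of DataArray.differentiate are illustrative stand-ins, not movement's actual implementation (which registers under "move").

```python
import numpy as np
import xarray as xr

@xr.register_dataset_accessor("kin")  # movement registers under "move"
class KinAccessor:
    def __init__(self, ds):
        self._ds = ds

    @property
    def velocity(self):
        # Compute on first access, then cache as a data variable
        # in the original dataset, alongside position
        if "velocity" not in self._ds:
            self._ds["velocity"] = self._ds["position"].differentiate("time")
        return self._ds["velocity"]

# A tiny dataset with linearly increasing positions
ds = xr.Dataset(
    {
        "position": xr.DataArray(
            np.arange(10.0).reshape(5, 2),
            dims=("time", "space"),
            coords={"time": np.arange(5.0)},
        )
    }
)
v = ds.kin.velocity     # computed and stored on first access
assert "velocity" in ds  # now behaves like a 'primary' variable
```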

All currently implemented kinematic variables - displacement, velocity, and acceleration - behave in this way. Through PR #155, so do their polar transformations.

velocity = ds.move.velocity
velocity_pol = ds.move.velocity_pol
speed = ds.move.speed

Pros

  • derived variables are automatically "saved" in a sensible way within the dataset and continue being available for future usage. After they are first computed, they consistently behave like the 'primary' data variables.
  • we potentially spare some computation. If a user requests velocity again, the stored variable will be returned.
  • Ease of use: users don't need to know that speed is the magnitude of the velocity vector or how it's exactly computed. They ask for things and they get them (with all the necessary steps happening under the hood).

Cons

  • Unfamiliar syntax. Users unacquainted with accessors (which is almost all users) may find the .move syntax strange and may not expect the automatic storage of variables.
  • Data duplication. If we follow this strategy for all three of velocity, velocity_pol and speed, we will be storing the same data in many different ways. velocity_pol is just a simple transform of velocity, so it may not be worth storing within the dataset. The case is even more extreme for speed: if we store both velocity_pol and speed, we would be keeping the exact same array of numbers twice. Moreover, calling ds.move.speed would result in calling both ds.move.velocity and ds.move.velocity_pol under the hood, and users may be surprised by all the extra data variables they suddenly end up with.

2. Getting derived variables via accessor methods

This alternative still relies on the MoveAccessor, but gets to the derived variables via custom methods, instead of custom properties.

For example:

velocity = ds.move.compute_velocity(coordinates="cartesian")
velocity_polar = ds.move.compute_velocity(coordinates="polar")

speed = ds.move.compute_speed()
# the above could be an alias for something like
speed = ds.move.compute_velocity(coordinates="polar").sel(space_pol="rho", drop=True)

Each of these methods would return a separate xarray.DataArray object which would NOT be automatically stored in the original dataset.

If the user wishes to store these in the dataset, they could do so explicitly:

ds["velocity"] = ds.move.compute_velocity()

Pros

  • More explicit, less unexpected things happen under the hood.
  • No automatic data duplication.
  • Arguments can be passed to the methods (e.g. "cartesian" vs "polar" in the above example).

Cons

  • Still relies on the unfamiliar .move syntax.
  • May result in redundant/duplicate computations. For example, ds.move.compute_speed() would re-compute velocity_polar to get its magnitude, even if the user had previously computed velocity_polar (but hadn't stored it in the same dataset).
  • Not as easy to use as alternative 1.

3. A mix of accessor properties and methods

From the above, it seems like using accessor properties duplicates data, while using accessor methods duplicates computation. Maybe it's possible to strike a balance between the two:

  • Some 'privileged' variables may behave as in alternative 1, i.e. they will be accessed via properties and automatically stored in the dataset. Kinematic variables in cartesian coordinates (e.g. velocity, acceleration) would be good candidates for this.
  • Other variables, especially those that can be trivially derived from the 'privileged' ones, will not be automatically stored and will instead be accessible via custom accessor methods.

This mixed approach could look something like this:

velocity = ds.move.velocity
velocity_pol = velocity.move.cart2pol()
speed = velocity.move.magnitude()

This variant would require us to provide an extra accessor to extend xarray.DataArray objects and specifically operate on data variables that contain an appropriate spatial dimension (this is where the cart2pol and magnitude methods would be implemented).
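Such a DataArray accessor could look roughly like this. The vec name and the method signatures are hypothetical, chosen here only to sketch the idea; movement's actual API may differ.

```python
import numpy as np
import xarray as xr

@xr.register_dataarray_accessor("vec")  # hypothetical accessor name
class VectorAccessor:
    def __init__(self, da):
        self._da = da

    def magnitude(self):
        # Euclidean norm along the spatial dimension
        return np.sqrt((self._da ** 2).sum(dim="space"))

    def cart2pol(self):
        # Replace the space dim with space_pol, holding rho and phi
        rho = self.magnitude()
        phi = np.arctan2(
            self._da.sel(space="y", drop=True),
            self._da.sel(space="x", drop=True),
        )
        return xr.concat([rho, phi], dim="space_pol").assign_coords(
            space_pol=["rho", "phi"]
        )

velocity = xr.DataArray(
    [[3.0, 4.0]], dims=("time", "space"), coords={"space": ["x", "y"]}
)
speed = velocity.vec.magnitude()
pol = velocity.vec.cart2pol()
```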

Pros

  • balances the computation and storage needs

Cons

  • Still relies on unfamiliar .move syntax.
  • Some variables behaving one way and others behaving in another way is inconsistent and will probably confuse users.
  • It requires us to implement an additional accessor for xarray.DataArray objects, potentially leading to further confusion.
  • Requires some knowledge from the user (e.g. they should know that speed is the magnitude of velocity).

4. Use both accessor properties and methods

Another approach could be to always supply both alternatives 1 and 2 for every variable, so the user could choose between them:

# This would automatically store the variables
# in the dataset, as in alternative 1
velocity = ds.move.velocity
velocity_pol = ds.move.velocity_pol

# This would NOT automatically store the variables
# in the dataset, as in alternative 2
velocity = ds.move.compute_velocity(coordinates="cartesian")  # alternative 2 
velocity_pol = ds.move.compute_velocity(coordinates="polar")  # alternative 2

Pros

  • Flexibility: the user can choose to prioritise computation or memory

Cons

  • Developer overhead, we have to implement and test both ways of doing things
  • We now have two ways of doing things, both of which rely on the unfamiliar .move syntax.
  • the trade-offs between the two methods may not be readily apparent, and users may be unsure which one to use, leading to situations where we compromise both computation and memory.

5. Forget about accessors

We can always abandon the accessor way of doing things, and (given that inheritance and composition are discouraged for xarray objects) forget about object-oriented programming (OOP) altogether.

We could instead rely on analysis and utility functions that take one xarray.DataArray, apply some operation to it, and return another xarray.DataArray, e.g.:

from movement.analysis import kinematics as kin
from movement.utils.vector import cart2pol, magnitude

velocity = kin.compute_velocity(ds["position"])
velocity_pol = cart2pol(velocity)
speed = magnitude(velocity)

The above is already possible, by the way (apart from the magnitude() function, which could easily be added).
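A possible shape for that missing free function is sketched below; the signature and the use of xr.apply_ufunc are assumptions in the style of movement.utils.vector, not the module's actual API.

```python
import numpy as np
import xarray as xr

def magnitude(da: xr.DataArray, dim: str = "space") -> xr.DataArray:
    """Euclidean norm of a vector-valued DataArray along `dim`."""
    # apply_ufunc moves `dim` to the last axis, which norm then reduces
    return xr.apply_ufunc(
        np.linalg.norm,
        da,
        input_core_dims=[[dim]],
        kwargs={"axis": -1},
    )

velocity = xr.DataArray([[3.0, 4.0], [0.0, 2.0]], dims=("time", "space"))
speed = magnitude(velocity)  # dims: ("time",)
```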

Pros

  • The most explicit of all options (no hidden magic)
  • no fussing with accessors and OOP, even Python beginners could understand the syntax.
  • Easy to implement and test.

Cons

  • Not very convenient to use. Users need to know exactly which modules to import and which functions to call.
  • More verbose

My personal take

After considering these alternatives, I lean towards sticking with the status quo (alternative 1) - i.e. every derived variable is an accessor property, and they all get stored as data variables in the dataset, duplication be damned.

This means that users will have to get used to the slightly strange .move syntax and behaviour, but at least these will be consistent throughout and there will be one main syntax to learn.

Power users who wish to override the default 'magic' behaviour can do so by using alternative 5, which already works anyway (and is what actually happens under the hood).

That said, I'm open to counter-arguments, and there may well be alternatives I haven't considered, so please chime in @neuroinformatics-unit/behaviour @b-peri !

niksirbi avatar Apr 09 '24 16:04 niksirbi