pymovements
add support for event-only datasets
Description of the problem
There are a lot of datasets published that only include event data and no raw gaze samples.
It would be great if the dataset library could also support these.
Description of a solution
The `DatasetDefinition` should probably be extended with a new attribute that signifies the type of data to expect.
My first proposal would be `datatype: Literal['raw', 'events']`. Let's brainstorm if we can find a better name.
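To make the proposal concrete, here is a minimal sketch of what such an attribute could look like. It is not the actual pymovements class: `DatasetDefinitionSketch`, the `DataType` alias, the `name` field, and the validation in `__post_init__` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Hypothetical alias for the proposed attribute type.
DataType = Literal['raw', 'events']


@dataclass
class DatasetDefinitionSketch:
    """Illustrative stand-in, not the real DatasetDefinition."""

    name: str
    # Defaulting to 'raw' keeps existing definitions valid without changes.
    datatype: DataType = 'raw'

    def __post_init__(self) -> None:
        # Literal types are not enforced at runtime, so check explicitly.
        if self.datatype not in get_args(DataType):
            raise ValueError(
                f"datatype must be one of {get_args(DataType)}, "
                f"got {self.datatype!r}",
            )


raw_definition = DatasetDefinitionSketch(name='some-raw-dataset')
events_definition = DatasetDefinitionSketch(name='some-event-dataset', datatype='events')
```

Defaulting to `'raw'` is one possible choice here; it would leave existing dataset definitions untouched.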
Additionally, the `Dataset.load()` method must be changed accordingly:
https://github.com/aeye-lab/pymovements/blob/cb9ef9571c5b24f7609928d18efc3ae2520c1d03/src/pymovements/dataset/dataset.py#L77-L135
Currently, it always runs `Dataset.load_gaze_files()`, but this should then depend on `DatasetDefinition.datatype`.
Also, `events` should be declared as `events: bool | None = None` and later resolved to a boolean value depending on `DatasetDefinition.datatype`.
I propose the following signature:

    def load(
            self,
            *,
            gaze: bool | None = None,
            events: bool | None = None,
            preprocessed: bool = False,
            subset: dict[str, float | int | str | list[float | int | str]] | None = None,
            events_dirname: str | None = None,
            preprocessed_dirname: str | None = None,
            extension: str = 'feather',
    ) -> Dataset:

This would be backwards compatible:

    # Only fill in defaults; explicitly passed values are kept as they are.
    if gaze is None:
        gaze = dataset.definition.datatype == 'raw'
    if events is None:
        events = dataset.definition.datatype == 'events'
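The default resolution can also be sketched as a standalone function, which makes the behaviour easy to check in isolation. This is only an illustration under stated assumptions: `resolve_load_flags` is a hypothetical helper, not part of pymovements, and it assumes that an explicitly passed `gaze` or `events` value should take precedence over the dataset's `datatype`.

```python
from typing import Literal, Optional, Tuple


def resolve_load_flags(
        datatype: Literal['raw', 'events'],
        gaze: Optional[bool] = None,
        events: Optional[bool] = None,
) -> Tuple[bool, bool]:
    """Hypothetical helper: turn None defaults into concrete booleans."""
    # None means "decide from the dataset type"; explicit values are kept.
    if gaze is None:
        gaze = datatype == 'raw'
    if events is None:
        events = datatype == 'events'
    return gaze, events


# Raw datasets keep loading gaze files by default.
raw_defaults = resolve_load_flags('raw')
# Event-only datasets load events and skip gaze files by default.
events_defaults = resolve_load_flags('events')
# An explicit argument overrides the datatype-based default.
raw_with_events = resolve_load_flags('raw', events=True)
```

With this resolution, a call without arguments on a `'raw'` dataset loads gaze files and no events, while callers can still request both explicitly.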
Minimum acceptance criteria
- [ ] add argument to `DatasetDefinition` to indicate type of dataset
- [ ] adjust `Dataset.load()` default values to decide loading gaze or events during runtime