[Task] Session-based preprocessing of `movielens-1m` dataset
Description
In the datasets API, we support item-level preprocessing for three variants of the MovieLens dataset. The goal of this task is to add a preprocessing function that groups each user's interactions into daily sessions.
Goals
- [ ] Define the ETL function to generate the session-based dataset
- [ ] Support the sequential ETL variant in `get_movielens`
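
As a rough illustration of the first goal, here is a minimal sketch of a daily-session ETL in plain Python. It is not the code from the linked PR, and the function name `build_daily_sessions`, the `min_session_length` parameter, and the output layout are all hypothetical. It assumes rows shaped like the `movielens-1m` ratings file (user id, movie id, rating, Unix timestamp in seconds) and that a "day" is cut on UTC boundaries; the real implementation would presumably work on dataframes instead.

```python
from collections import defaultdict
from datetime import datetime, timezone


def build_daily_sessions(ratings, min_session_length=2):
    """Group (user_id, item_id, rating, timestamp) rows into per-user daily sessions.

    Hypothetical helper: a session is defined here as all interactions of one
    user on one UTC calendar day. Sessions shorter than `min_session_length`
    are dropped (an assumed filter, not a stated requirement).
    """
    sessions = defaultdict(list)
    for user_id, item_id, rating, ts in ratings:
        # Assumption: timestamps are Unix seconds; days are cut in UTC.
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        sessions[(user_id, day)].append((ts, item_id, rating))

    result = []
    # Iterate in (user_id, day) order so the output is deterministic.
    for (user_id, day), events in sorted(sessions.items()):
        if len(events) < min_session_length:
            continue
        events.sort()  # chronological order within the session
        result.append(
            {
                "user_id": user_id,
                "day": day.isoformat(),
                "item_ids": [item for _, item, _ in events],
                "ratings": [r for _, _, r in events],
            }
        )
    return result


if __name__ == "__main__":
    # Toy rows, not real MovieLens data.
    toy = [
        (1, 10, 5, 86400 + 100),
        (1, 11, 4, 86400 + 50),
        (1, 12, 3, 2 * 86400 + 10),
    ]
    for session in build_daily_sessions(toy):
        print(session)
```

Keeping the session key as `(user_id, day)` makes it easy to swap in a different session definition later, for example an inactivity gap instead of a calendar day.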
Starting Point
In this closed PR, we have a first version of the session-level ETL: https://github.com/NVIDIA-Merlin/models/blob/5f10aafb4f4dd8bb44fff11382a5ca86311f5dba/merlin/models/data/movielens.py#L443