
[Task] Session-based preprocessing of `movielens-1m` dataset

Open sararb opened this issue 3 years ago • 0 comments

Description

In the datasets API, we support item-level preprocessing for three variants of the MovieLens dataset. The goal of this task is to add a preprocessing function that generates daily user sessions, i.e. one session per user per day.

Goals

  • [ ] Define the ETL function to generate the session-based dataset
  • [ ] Support the sequential ETL variant in get_movielens
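
To make the first goal concrete, here is a minimal pure-Python sketch of what "daily user sessions" could mean. It is not the ETL from the linked PR: the function name `build_daily_sessions`, the `(user_id, item_id, timestamp)` row layout, and the choice of UTC calendar days as session boundaries are all assumptions made for illustration. The actual implementation would presumably operate on dataframes inside the datasets API.

```python
from collections import defaultdict
from datetime import datetime, timezone


def build_daily_sessions(interactions):
    """Group interaction rows into one session per (user, day).

    `interactions` is an iterable of (user_id, item_id, unix_timestamp)
    tuples (hypothetical layout). Returns a list of dicts sorted by user
    and day, with item ids in chronological order inside each session.
    """
    buckets = defaultdict(list)
    for user_id, item_id, ts in interactions:
        # Assumption: a session boundary is a UTC calendar day.
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        buckets[(user_id, day)].append((ts, item_id))

    sessions = []
    for (user_id, day), events in sorted(buckets.items()):
        events.sort()  # order events by timestamp within the session
        sessions.append(
            {
                "user_id": user_id,
                "day": day.isoformat(),
                "item_ids": [item_id for _, item_id in events],
            }
        )
    return sessions


# Toy rows, deliberately out of order (not real MovieLens data).
rows = [
    (1, 20, 978310800),  # 2001-01-01 01:00:00 UTC
    (1, 10, 978307200),  # 2001-01-01 00:00:00 UTC
    (1, 30, 978393600),  # 2001-01-02 00:00:00 UTC
    (2, 10, 978307260),  # 2001-01-01 00:01:00 UTC
]
sessions = build_daily_sessions(rows)
```

With the toy rows above, user 1 ends up with two sessions (one per day) and user 2 with one. Whether sessions should be cut by calendar day or by an inactivity gap is a design choice this sketch leaves open.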

Starting Point

In this closed PR, we have a first version of the session-level ETL: https://github.com/NVIDIA-Merlin/models/blob/5f10aafb4f4dd8bb44fff11382a5ca86311f5dba/merlin/models/data/movielens.py#L443

sararb, Sep 13 '22 13:09