ecom-rakuten

Winning solution for the Rakuten Data Challenge, as part of SIGIR eCom '18.

The details of the model choices and evolution can be found in the system description paper for that workshop.

Usage

Data Preparation

Set up the expected data directories, from the repository root:

mkdir -p data/models

Move the challenge files into the data/ subdirectory:

mv path/to/rdc-catalog-train.tsv data/
mv path/to/rdc-catalog-test.tsv data/

Run a train/test split, build the vocabularies, and save the int-encoded training and validation sets for later:

./prep.sh
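As a rough illustration of what a preparation step like this involves, the sketch below splits rows into training and validation sets, builds a vocabulary from the training titles, and int-encodes text. It is not the code behind prep.sh: whitespace tokenization, the 20% holdout, the reserved ids, and the names split_rows, build_vocab, and encode are all assumptions, and the sample rows are made up.

```python
import random
from collections import Counter

PAD, UNK = 0, 1  # reserved ids for padding and unknown tokens (an assumption)

def split_rows(rows, holdout=0.2, seed=0):
    """Shuffle rows deterministically and hold out a fraction for validation."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_val = int(len(rows) * holdout)
    return rows[n_val:], rows[:n_val]

def build_vocab(texts, min_count=1):
    """Map each sufficiently frequent token to an integer id."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for tok, n in counts.most_common():
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    """Turn a title into a list of ids; unseen tokens map to UNK."""
    return [vocab.get(tok, UNK) for tok in text.lower().split()]

# Made-up (title, category) rows, for illustration only.
rows = [
    ("red cotton shirt", "A>B"),
    ("blue cotton shirt", "A>B"),
    ("steel frying pan", "C>D"),
    ("cast iron pan", "C>D"),
    ("wool socks", "A>E"),
]
train, val = split_rows(rows, holdout=0.2)
vocab = build_vocab(title for title, _ in train)   # vocabulary from training data only
encoded_val = [encode(title, vocab) for title, _ in val]
```

Building the vocabulary from the training portion alone keeps validation tokens from leaking into it, which is why unseen validation tokens fall back to the unknown id.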

BPV Model Training

Train and save a forward model with the hyperparameters from the winning RDC solution (the model is saved to data/models/model-name.h5):

./train.sh model-name

Train a reverse model, intended for use in building a bi-directional ensemble with a forward network:

./train.sh reverse-model --reverse
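One plausible reading of a "reverse" model is that it sees each int-encoded title right to left. The sketch below shows that transformation under this assumption; the function name reverse_tokens, the padding id 0, and the choice to leave trailing padding in place are hypothetical and not taken from the repository.

```python
def reverse_tokens(ids, pad_id=0):
    """Reverse the real tokens of a sequence, leaving trailing padding at the end."""
    n_real = len(ids)
    # Skip over trailing padding so it is not moved to the front.
    while n_real > 0 and ids[n_real - 1] == pad_id:
        n_real -= 1
    return ids[:n_real][::-1] + ids[n_real:]

forward_input = [5, 9, 3, 0, 0]
reverse_input = reverse_tokens(forward_input)
```

A forward and a reverse network trained this way make different errors, which is what makes combining them worthwhile.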

Train a model with some parameters different from the default:

./train.sh custom-model-name --n-epochs=20 --lr=1.2

See a full list of parameters available to tune via flags:

./train.sh -- --help

Inference, Prediction, and Scoring

Run inference on the validation set, generate predictions, and output precision, recall, and F1:

./infer.sh model-name
./infer.sh --forward=model-name  # equivalent
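For reference, the sketch below computes precision, recall, and F1 for a multi-class problem, assuming support-weighted averaging over classes (the same convention as average='weighted' in scikit-learn). Whether infer.sh reports exactly these quantities is an assumption; the function name weighted_prf and the toy labels are illustrative.

```python
from collections import Counter

def weighted_prf(y_true, y_pred):
    """Support-weighted precision, recall, and F1 over all true classes."""
    support = Counter(y_true)                                   # examples per true class
    predicted = Counter(y_pred)                                 # predictions per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p) # correct predictions per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        p = hits[label] / predicted[label] if predicted[label] else 0.0
        r = hits[label] / n
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = n / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1

y_true = ["A", "A", "B", "B"]
y_pred = ["A", "B", "B", "B"]
precision, recall, f1 = weighted_prf(y_true, y_pred)
```

With weighted averaging, frequent categories dominate the score, so a model can look strong while still performing poorly on rare categories.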

Score a reverse model:

./infer.sh --reverse=reverse-model

Similarly for a bi-directional ensemble:

./infer.sh --forward=model-name --reverse=reverse-model

Or for a larger ensemble, e.g. with 4 each forward and reverse:

./infer.sh --forward=fwd1,fwd2,fwd3,fwd4 --reverse=rev1,rev2,rev3,rev4

Since that can take a while, you can show intermediate results along the way:

./infer.sh --forward=fwd1,fwd2,fwd3,fwd4 --reverse=rev1,rev2,rev3,rev4 --debug
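A common way to combine several classifiers is to average their predicted class probabilities and take the most likely class. The sketch below shows that idea for one example; it assumes simple unweighted averaging, which may differ from what infer.sh does, and the names average_probs and predict as well as the labels and probabilities are made up.

```python
def average_probs(prob_lists):
    """Average class-probability vectors from several models for one example."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]

def predict(prob_lists, labels):
    """Return the label with the highest averaged probability."""
    avg = average_probs(prob_lists)
    best = max(range(len(avg)), key=avg.__getitem__)
    return labels[best]

labels = ["A>B", "C>D", "A>E"]      # hypothetical category paths
forward_probs = [0.5, 0.4, 0.1]     # made-up output of a forward model
reverse_probs = [0.2, 0.7, 0.1]     # made-up output of a reverse model
ensemble_label = predict([forward_probs, reverse_probs], labels)
```

Here the forward model alone would pick the first label, but the averaged distribution favors the second. The same function accepts any number of probability vectors, so a larger ensemble only changes the length of the input list.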

Run test-set inference and write prediction files for a single model (ensembles work analogously to the commands above):

./infer.sh model-name --is-test