kaggle-predict-future-sales icon indicating copy to clipboard operation
kaggle-predict-future-sales copied to clipboard

Kaggle's Predict Future Sales competition project (TOP 15 solution as of March 2020)

Predict Future Sales

My attempt to solve Kaggle's Predict Future Sales competition.

Contains:

  • RNN approach (pytorch)
  • Data processing script in Spark (scala)
  • Creation of embeddings for item/item category/shop descriptions
  • Feature importance check using eli5
  • And more!

Report

For the solution report, check here.

Automatic hyperparameter tuning

For RNN-based model, automatic hyperparemeter tuning is available with Guild AI AutoML tool.

Check this document for details.

External resources

  • Datasets from https://www.kaggle.com/c/competitive-data-science-predict-future-sales/data
  • If you would like to use russian embeddings pretrained from wikipedia, download them from here

The default location for these is the data folder.

Dependencies

All you need is included in environment.yml. Conda is the package manager for this project.