movie-posters-convnet
Unsupervised clustering of movie posters with features extracted from Convolutional Neural Network
Demo
Overview
Unsupervised clustering of movie posters using features extracted from a convolutional neural network. The visualization uses Flask for the backend and D3.js for the frontend.
This project is divided into 3 main scripts:
- get_posters.py
  - retrieves the posters from impawards.com.
  - creates a thumbnail of each poster for the visualization.
- get_features_from_cnn.py
- get_data_visu.py
  - reduces the feature dimensions with umap for data visualization.
  - computes the cosine similarity and extracts the 6 "closest" images for each poster.
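The nearest-neighbour step described above can be sketched as follows. This is a minimal illustration with NumPy, not the code from get_data_visu.py: the function name `closest_posters` and the toy feature vectors are assumptions made for the example.

```python
import numpy as np


def closest_posters(features, k=6):
    """Return, for each row, the indices of the k most cosine-similar other rows.

    Hypothetical helper for illustration; the project's own script may differ.
    """
    features = np.asarray(features, dtype=np.float64)
    # L2-normalise each row so that a dot product equals the cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    # Dense n x n similarity matrix: this is the O(n^2) memory cost.
    similarity = unit @ unit.T
    # A poster must not be returned as its own neighbour.
    np.fill_diagonal(similarity, -np.inf)
    # Sort each row by decreasing similarity and keep the first k columns.
    return np.argsort(-similarity, axis=1)[:, :k]


# Toy data: four made-up 2-D feature vectors forming two obvious pairs.
toy = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
neighbours = closest_posters(toy, k=1)
print(neighbours.ravel())
```

With real CNN features, each row would be one poster's feature vector and `k=6` would match the six neighbours mentioned above.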
To see a description of each script's parameters:
- python src/get_XXX.py --help
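For readers unfamiliar with how such a `--help` message is typically produced, here is a small sketch using Python's standard `argparse` module. It is an assumption that the scripts use `argparse`; only the `-c` option appears in this README, and the long name `--config`, the default value, and the help text are hypothetical.

```python
import argparse


def build_parser():
    """Build a hypothetical parser mirroring the -c option used in this README."""
    parser = argparse.ArgumentParser(
        description="Illustrative parser; the real scripts may define other options."
    )
    parser.add_argument(
        "-c",
        "--config",  # long option name is an assumption
        default="config/development.conf",
        help="path to the configuration file",
    )
    return parser


# argparse adds -h/--help automatically; here we parse an explicit argument list.
args = build_parser().parse_args(["-c", "config/development.conf"])
print(args.config)
```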
Requirements
OS
- Linux/Unix/OSX (required for wget)
- Python 3.3+
- ImageMagick
- PostgreSQL
Python packages
- BeautifulSoup 4.4
- Tensorflow
- Keras
- Pandas
- requests
- sklearn
- numpy
- PIL
- flask
Warnings
Extracting the features from the ConvNet is slow if you do not have a GPU. Computing the similarity between every pair of posters requires O(n^2) memory, which amounts to around 32 GB of RAM.
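To make the quadratic memory cost concrete, the following back-of-the-envelope sketch estimates the size of a dense n x n similarity matrix. The helper names are hypothetical, and 4 bytes per value (float32) is an assumption; the project may store the matrix differently.

```python
def similarity_matrix_gb(n_posters, bytes_per_value=4):
    """Size in GB (10^9 bytes) of a dense n x n matrix; float32 is assumed by default."""
    return n_posters ** 2 * bytes_per_value / 1e9


def max_posters_for(ram_gb, bytes_per_value=4):
    """Largest n whose dense n x n matrix fits in the given amount of RAM."""
    return int((ram_gb * 1e9 / bytes_per_value) ** 0.5)


print(similarity_matrix_gb(10_000))  # 10,000 posters -> 0.4 GB
print(max_posters_for(32))           # posters that fit in 32 GB at float32
```

Doubling the number of posters quadruples the memory, so processing the posters in blocks of rows is a common way to stay within a smaller RAM budget.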
Installation
Clone the repository:
$ git clone https://github.com/adrz/movie-posters-convnet.git
$ cd movie-posters-convnet/
$ virtualenv -p python3 env
$ source env/bin/activate
$ pip install -r requirements-gpu.txt
Create the PostgreSQL database (assuming PostgreSQL is already installed):
$ psql -U postgres -c "create user movieposters;"
$ psql -U postgres -c "create database movieposters;"
$ psql -U postgres -c "alter user movieposters with encrypted password 'yourpassword';"
$ psql -U postgres -c "grant all privileges on database movieposters to movieposters;"
Usage
Computation
After installation, run the following three scripts in order. They will:
- download posters from 1920 to 2016
- compute the CNN features
- compute the data-visualization features
$ python src/get_posters.py -c config/development.conf
$ python src/get_features_from_cnn.py -c config/development.conf
$ python src/get_data_visu.py -c config/development.conf
Then grab a coffee...
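The three steps above can also be chained from a single Python entry point. The sketch below is a hypothetical wrapper, not a file shipped with the project; it uses the script names listed in the Overview and the standard `subprocess` module, and stops at the first step that fails.

```python
import subprocess
import sys

# Script names as listed in the Overview, in the order they must run.
SCRIPTS = [
    "src/get_posters.py",
    "src/get_features_from_cnn.py",
    "src/get_data_visu.py",
]


def build_commands(config="config/development.conf", python=sys.executable):
    """Return one command (as an argument list) per pipeline step."""
    return [[python, script, "-c", config] for script in SCRIPTS]


def run_pipeline(config="config/development.conf"):
    """Run every step in order; check=True raises if a step exits with an error."""
    for command in build_commands(config):
        subprocess.run(command, check=True)


# Show what would be executed without running the long computation.
for command in build_commands(python="python"):
    print(" ".join(command))
```

Calling `run_pipeline()` from the repository root, inside the activated virtualenv, would then execute the three steps in sequence.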
Visualization
$ source env/bin/activate
$ export configapi=./config/development.conf
$ python app.py
Then open index.html in your favorite browser:
$ chromium 127.0.0.1:5000/index.html
or
$ chromium 127.0.0.1:5000/index_complete.html
Results
Cherry-picked examples from the 200 closest pairs of posters (by cosine distance):
License
This project is licensed under the MIT License. See the LICENSE.md file for details.
Acknowledgments
- posters: IMP Awards