Chord Suggester

Initial note: This readme explains how to run this project. For a detailed report on the scope of the project, please see this article on Medium.

ChordSuggester is a computer-aided musical composition system. It is not intended to be a professional tool; it is the result of a Master’s thesis that covers the whole process of a data science project:

Data acquisition, by scraping ultimate-guitar.com with Selenium and BeautifulSoup. This part is interesting in itself, since there are no examples of clean datasets of songs with chords.
Data cleaning and preparation, using Pandas and music21.
Data analysis, using Pandas.
Modelling, using Keras for training an LSTM neural network.
Visualisation of the results on a React Application that consumes the model using TensorFlow.js and shows the results using the music libraries Tone.js and Vexflow. This code is in a separate repo.
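
To make the modelling step more concrete, here is a minimal sketch of how a chord sequence could be cut into training examples for a next-chord predictor. The function name make_windows, the window length and the toy progression are assumptions made for illustration; they are not taken from the notebooks.

```python
from typing import List, Tuple


def make_windows(chords: List[str], window: int) -> List[Tuple[List[str], str]]:
    """Split a chord sequence into (previous chords, next chord) pairs."""
    pairs = []
    for start in range(len(chords) - window):
        context = chords[start:start + window]  # the chords the model sees
        target = chords[start + window]         # the chord it should predict
        pairs.append((context, target))
    return pairs


# Toy progression, assumed for illustration only.
song = ["C", "Am", "F", "G", "C", "Am"]
pairs = make_windows(song, window=3)
for context, target in pairs:
    print(context, "->", target)
```

A sequence shorter than or equal to the window yields no pairs, so very short songs would simply contribute nothing to training under this scheme.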

Notebooks

There are five notebooks that cover all the needs of the project. Before running them, please read the whole readme carefully. The notebooks are, in order:

Scraping - Extracting filter criteria. Extracts the filter criteria (genre, style and decade) to be used by the next notebook.
Scraping - Extracting songs. Extracts the songs (name, decade, url, genre, chords...).
Feature extraction. Feature engineering over the dataset extracted by the previous notebook.
Model. Trains an LSTM to predict the most probable chords after a given chord sequence.
Exporting model to Javascript. Some utilities to export dictionaries from Python to Javascript.
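
As an illustration of what the last notebook is for, the sketch below renders a Python dictionary as a JavaScript constant using only the standard library. The names build_vocabulary, to_js_module and CHORD_TO_INDEX are hypothetical, and the real notebook may export different dictionaries in a different way.

```python
import json


def build_vocabulary(chords):
    """Map each distinct chord name to an integer index (sorted for stability)."""
    return {name: index for index, name in enumerate(sorted(set(chords)))}


def to_js_module(name, mapping):
    """Render a Python dict as a JavaScript constant declaration."""
    body = json.dumps(mapping, indent=2, sort_keys=True)
    return "export const {} = {};\n".format(name, body)


vocabulary = build_vocabulary(["C", "Am", "F", "G", "C"])
js_source = to_js_module("CHORD_TO_INDEX", vocabulary)
print(js_source)
```

Because JSON is valid JavaScript object syntax, the resulting text can be written to a .js file and imported by the frontend, so both sides share the same chord-to-index mapping.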

The rest of the notebooks (names starting with DRAFT_) were used to inspect data, explore different approaches, etc. They do not need to be run, but they may be interesting for following the development process.

Installing libraries

The easiest way to run this project is to install the latest version of Anaconda. Most libraries used by this project are included in this distribution.

Once it is installed, there are three options to install the rest of the required libraries:

Install only libraries not included in Conda (automatic way) by executing:

pip install -r src/requirements.txt

Install only libraries not included in Conda (manual way) by executing:

pip install "pytest==5.3.2"
pip install "selenium==3.141.0"
pip install "music21==5.7.0"
pip install "beautifulsoup4==4.8.2"

Create a Conda environment by using:

conda create --name <env> --file src/requirements-conda.txt
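
Whichever option is used, it can be useful to confirm that the installed versions match the pinned ones. The following sketch does that with importlib.metadata (available from Python 3.8); the helper name check_pins is an assumption, and the pins are the ones listed above.

```python
from importlib import metadata  # standard library from Python 3.8

PINS = {
    "pytest": "5.3.2",
    "selenium": "3.141.0",
    "music21": "5.7.0",
    "beautifulsoup4": "4.8.2",
}


def check_pins(pins):
    """Return {package: (expected, installed)}; installed is None if missing."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


for name, (expected, installed) in check_pins(PINS).items():
    status = "ok" if installed == expected else "MISMATCH"
    print(f"{name}: expected {expected}, found {installed} [{status}]")
```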

Installing Selenium

The scraping notebooks (01 - Scraping - Extracting filter criteria.ipynb and 02 - Scraping - Extracting songs.ipynb) need ChromeDriver: download it from here, unzip it and copy it to the same folder as the notebooks (the src folder). My version is included in the repo, but it may not work on your computer, because the driver must be compatible with the installed Chrome version.

On macOS, you must additionally allow apps from unidentified developers to run: open System Preferences, click Security & Privacy, and change Allow apps downloaded from to Anywhere.
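
A quick way to reason about driver compatibility is to compare major version numbers. The sketch below assumes that ChromeDriver and Chrome are compatible when their major versions match, and that the version strings look like the sample strings shown; both helper names are hypothetical.

```python
import re


def major_version(version_output):
    """Extract the major version from text such as 'ChromeDriver 79.0.3945.36'."""
    match = re.search(r"(\d+)\.\d+\.\d+", version_output)
    return int(match.group(1)) if match else None


def is_compatible(driver_output, chrome_output):
    """Treat driver and browser as compatible when their major versions match."""
    driver = major_version(driver_output)
    chrome = major_version(chrome_output)
    return driver is not None and driver == chrome


# Sample strings, assumed to resemble the version output of each program.
print(is_compatible("ChromeDriver 79.0.3945.36", "Google Chrome 79.0.3945.88"))
print(is_compatible("ChromeDriver 79.0.3945.36", "Google Chrome 80.0.3987.87"))
```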

Showing sheets on notebooks

For the show() function of music21 to work in notebooks, music engraving software (such as Finale, Sibelius or MuseScore) has to be installed.

I recommend MuseScore because it is free, open source, lightweight and easy to install.
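
If music21 does not find MuseScore on its own, it can be pointed at the executable through its user settings. The sketch below is only an outline: the executable names in CANDIDATES are assumptions that vary by platform and MuseScore version, and configure_music21 writes to your personal music21 settings, so it is left uncalled here.

```python
import shutil

# Executable names are assumptions; they differ across platforms and versions.
CANDIDATES = ("mscore", "musescore", "mscore3", "MuseScore3")


def find_musescore(candidates=CANDIDATES):
    """Return the first candidate executable found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def configure_music21(path):
    """Tell music21 where MuseScore lives (persists in the user's settings)."""
    from music21 import environment  # imported lazily; requires music21

    settings = environment.UserSettings()
    settings["musicxmlPath"] = path
    settings["musescoreDirectPNGPath"] = path


musescore = find_musescore()
print(musescore or "MuseScore not found on PATH")
```

On macOS, MuseScore is usually installed as an application bundle rather than on PATH, so the full path to the executable would have to be passed to configure_music21 by hand.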

Converting model created from Python Keras to TensorFlow.js format in a Conda Environment

TensorFlow.js is required, but please stop and do not run pip install tensorflowjs right away, because it could break your Anaconda installation (it happened to me...).

The reason is that it requires Python 3.6.8, and recent Anaconda distributions ship a newer version.

1. Install Python 3.6.8 in a virtual environment:

To force Python 3.6.8 in your local project, you can install pyenv and proceed as follows in the target directory:

pyenv install 3.6.8
pyenv local 3.6.8

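Now you can create and activate a virtual environment named venv in your current folder. Recent virtualenv releases isolate the environment from the site packages by default and no longer accept the old --no-site-packages flag, so it is omitted here:

virtualenv venv
. venv/bin/activate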
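
Before installing anything in the new environment, it may be worth confirming that the active interpreter really is the expected one. This small check assumes that any 3.6.x interpreter is acceptable, which is a looser reading of the 3.6.8 requirement stated above.

```python
import sys

REQUIRED = (3, 6)  # major.minor, taken from the 3.6.8 requirement stated above


def is_supported(version_info, required=REQUIRED):
    """Return True when the interpreter's major.minor matches the required one."""
    return tuple(version_info[:2]) == required


if is_supported(sys.version_info):
    print("Interpreter looks suitable for tensorflowjs:", sys.version.split()[0])
else:
    print("Unexpected Python version:", sys.version.split()[0])
```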

2. Install the TensorFlow.js pip package:

pip install tensorflowjs

3. Run (from command line) the converter script provided by the pip package:

In this case, our models are in HDF5 format.

tensorflowjs_converter \
    --input_format=keras \
    /tmp/my_keras_model.h5 \
    /tmp/my_tfjs_model

Note that the input path used above stands for an .h5 file saved by Keras through the ModelCheckpoint callback during training.

The output folder will contain a model.json file, together with the binary weight files it references, ready to be copied to the public folder of the frontend.

To convert all the generated models and copy them to the frontend directory easily, two scripts are provided. They can be found in the model folder, where all the generated models are saved:

convert-all-models.sh: converts all the models (files with the .h5 extension) to TensorFlow.js models (folders starting with tfjs_model). To execute it, type this from a terminal:

sh convert-all-models.sh

copy-models-to-frontend.sh: copies all the TensorFlow.js models (folders) into the public/models folder of the frontend project. This requires the frontend repository to be under the same folder as this repository. To execute it, type this from a terminal:

sh copy-models-to-frontend.sh
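
For readers who prefer Python over shell, the two scripts could be approximated as follows. This is a sketch under assumptions: the contents of the real scripts are not shown in this readme, the output folder naming ("tfjs_model_" plus the file stem) is a guess, and both function names are hypothetical.

```python
import shutil
from pathlib import Path


def conversion_commands(model_dir):
    """Build one tensorflowjs_converter command per .h5 file in model_dir."""
    commands = []
    for h5_file in sorted(Path(model_dir).glob("*.h5")):
        # Assumed naming scheme for the output folder.
        output_dir = h5_file.with_name("tfjs_model_" + h5_file.stem)
        commands.append([
            "tensorflowjs_converter",
            "--input_format=keras",
            str(h5_file),
            str(output_dir),
        ])
    return commands


def copy_models(model_dir, frontend_models_dir):
    """Copy every tfjs_model* folder into the frontend models folder."""
    destination = Path(frontend_models_dir)
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for folder in sorted(Path(model_dir).glob("tfjs_model*")):
        if folder.is_dir():
            target = destination / folder.name
            if target.exists():
                shutil.rmtree(target)  # replace a stale copy
            shutil.copytree(folder, target)
            copied.append(target)
    return copied
```

Each command returned by conversion_commands can be passed to subprocess.run(command, check=True) from inside the Python 3.6.8 environment, and copy_models would then be called with the path to the public/models folder of the frontend checkout.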

.py files

To avoid errors and improve the quality of the codebase, some functions have been extracted from the notebooks into .py files. All these files follow the naming pattern jl_xxx.py. This makes it possible to:

Reuse functions in different notebooks.
Test these functions. This is important in a data science project, where much time is wasted discovering errors or, even worse, where hidden errors cause misbehaviour in production.

jl_pychord

Pychord is a nice library to manage musical chords in Python. It does not have all the necessary functionality, so its repo has been cloned and modified here. This is technical debt: the right action would have been to fork the repo, add the necessary documentation and even create a PR asking the author to merge it.

Testing

pytest, a unit testing library, is included in the requirements.txt file.

Most .py files are covered by tests.

To run the tests, once pytest is installed, type the following in a terminal from the src folder:

pytest
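
For readers new to pytest, a test file is just a module whose functions start with test_ and use plain assert statements. The example below is a sketch: transpose_root is a hypothetical helper written for this illustration and is not one of the functions in the jl_xxx.py files.

```python
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def transpose_root(root, semitones):
    """Shift a chord root by a number of semitones, wrapping around the octave."""
    return NOTES[(NOTES.index(root) + semitones) % len(NOTES)]


def test_transpose_up():
    assert transpose_root("C", 7) == "G"


def test_transpose_wraps_around():
    assert transpose_root("A", 5) == "D"


def test_transpose_down():
    assert transpose_root("C", -1) == "B"
```

Saved as a file whose name starts with test_ inside the src folder, this would be discovered and run automatically by the pytest command above.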

Models

In the model folder, several models have been exported in both h5 and TensorFlow.js formats. Some of them may be experiments that do not work at all.

The most accurate models (the ones used in front-end demo) are tfjs_model_lstm_normalised__W_20_lr_0_0005_epochs=50_batch_128.h5 and tfjs_model_lstm_normalised__W_20_lr_0_001_epochs=50_batch_128.h5.
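
The model file names appear to encode the training hyperparameters. The sketch below parses them under that assumption: W is read as the window length, and lr_0_0005 as a learning rate of 0.0005. This reading is inferred from the names alone and is not confirmed elsewhere in this readme.

```python
import re

PATTERN = re.compile(
    r"W_(?P<window>\d+)_lr_(?P<lr>\d+_\d+)"
    r"_epochs=(?P<epochs>\d+)_batch_(?P<batch>\d+)"
)


def parse_model_name(filename):
    """Extract the assumed hyperparameters from a model file name, or None."""
    match = PATTERN.search(filename)
    if match is None:
        return None
    return {
        "window": int(match.group("window")),
        # The underscore is assumed to stand in for the decimal point.
        "learning_rate": float(match.group("lr").replace("_", ".")),
        "epochs": int(match.group("epochs")),
        "batch_size": int(match.group("batch")),
    }


name = "tfjs_model_lstm_normalised__W_20_lr_0_0005_epochs=50_batch_128.h5"
print(parse_model_name(name))
```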
