lakh-pianoroll-dataset

A collection of 174,154 multi-track piano-rolls

Source Code for Deriving Lakh Pianoroll Dataset (LPD)

The dataset derived with the default settings is available here.

Download Lakh MIDI Dataset (LMD) with the following script.
```
./scripts/download_lmd.sh
```
(Or, download it manually here.)
Set the variables LMD_ROOT and LPD_ROOT in run.sh, and the variables in config.py, to values appropriate for your setup.
Derive all subsets and versions of LPD, matched_ids.txt and cleansed_ids.txt with the following script.
```
./scripts/derive_lpd.sh
```
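
Before starting the derivation, it can help to confirm that the two root paths point where you expect. The following is a minimal sketch, not part of the repository: `check_roots` is a hypothetical helper, and the default paths `./data/lmd` and `./data/lpd` are assumptions that should be replaced with the values you set in run.sh.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check. LMD_ROOT and LPD_ROOT mirror the variable
# names used in run.sh; the default paths below are assumptions.
LMD_ROOT="${LMD_ROOT:-./data/lmd}"
LPD_ROOT="${LPD_ROOT:-./data/lpd}"

check_roots() {
  # $1: source dataset root (must already exist)
  # $2: output root (created if missing)
  if [ ! -d "$1" ]; then
    echo "LMD root not found: $1" >&2
    return 1
  fi
  mkdir -p "$2"
}

if check_roots "$LMD_ROOT" "$LPD_ROOT"; then
  echo "Paths look fine; ready to run ./scripts/derive_lpd.sh"
fi
```

If the check fails, download LMD first or correct the paths before running the derivation script.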

Derive the labels for the LPD

The derived labels can be found at data/labels.tar.gz.

Download the labels with the following script.
```
./scripts/download_labels.sh
```
Derive the labels with the following script.
```
./scripts/derive_labels.sh
```
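
To use the pre-derived labels directly, the archive at data/labels.tar.gz can be unpacked with `tar`. This is a small sketch under assumptions: `unpack_labels` is a hypothetical helper, the destination `data/labels` is an arbitrary choice, and the layout inside the archive is not described here, so inspect the result after extracting.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract a gzip-compressed tar archive into a directory.
unpack_labels() {
  local archive="$1"   # path to the .tar.gz file
  local dest="$2"      # directory to extract into (created if missing)
  mkdir -p "$dest"
  tar -xzf "$archive" -C "$dest"
}

# Only attempt extraction when the archive is actually present.
if [ -f data/labels.tar.gz ]; then
  unpack_labels data/labels.tar.gz data/labels
fi
```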

Synthesize audio files for the LPD

Install GNU Parallel to run the synthesizer in parallel mode.
Synthesize audio files from multitrack pianorolls with the following script.
```
./scripts/batch_synthesize.sh ./data/lpd/lpd/lpd_cleansed/ \
  ./data/synthesized/lpd_cleansed 20
```
(The above command will synthesize all the multitrack pianorolls in the LPD-cleansed subset with 20 parallel jobs.)
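
A long batch run is easier to manage if you first confirm that GNU Parallel is installed and see how many files will be processed. The wrapper below is a sketch only: `count_pianorolls` is a hypothetical helper, and it assumes the pianorolls are stored as `.npz` files, which may not match your copy of the dataset.

```shell
#!/usr/bin/env bash
# Same arguments as the command above: source dir, output dir, job count.
SRC="./data/lpd/lpd/lpd_cleansed/"
DST="./data/synthesized/lpd_cleansed"
N_JOBS=20

count_pianorolls() {
  # Count files recursively; the .npz extension is an assumption.
  find "$1" -type f -name '*.npz' | wc -l | tr -d ' '
}

if ! command -v parallel >/dev/null 2>&1; then
  echo "GNU Parallel not found; install it before synthesizing" >&2
elif [ -d "$SRC" ]; then
  echo "Synthesizing $(count_pianorolls "$SRC") files with $N_JOBS jobs"
  ./scripts/batch_synthesize.sh "$SRC" "$DST" "$N_JOBS"
fi
```

Lowering `N_JOBS` is a reasonable first step if the machine runs short of memory or CPU during synthesis.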
