SPTK
A suite of speech signal processing tools
The Speech Signal Processing Toolkit (SPTK) is a suite of speech signal processing tools.
Documentation
See this page for a reference manual.
Requirements
- GCC 4.8.5+ / Clang 3.5.0+ / Visual Studio 2015+
- CMake 3.1+
Installation
Linux / macOS
The latest release can be downloaded through Git. The install procedure is as follows.
git clone https://github.com/sp-nitech/SPTK.git
cd SPTK
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=.. # Please change install directory.
make -j 4 install # Please change the number of jobs depending on your environment.
The SPTK commands can then be used by adding the bin/ directory to the PATH environment variable.
To use SPTK functions in your own program, link against the static library lib/libsptk.a.
Windows
You may need to add cmake and MSBuild to the PATH environment variable in advance.
Open Command Prompt and follow the procedure below:
cd /path/to/SPTK # Please change here to your appropriate path.
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=.. # Please change install directory.
MSBuild -maxcpucount:4 /p:Configuration=Release INSTALL.vcxproj
You can also compile the programs via the GUI instead of running MSBuild.
The SPTK functions can then be used by linking against the static library lib/sptk.lib.
Demonstration
- Analysis-synthesis via mel-cepstrum
- Parametric coding via line spectral pairs
Examples
SPTK provides several examples.
Go to an example directory and execute run.sh, e.g.,
cd egs/analysis_synthesis/mgc
./run.sh
The following simple example decreases the volume of the audio in input.wav by halving its amplitude.
You may need to install the sox command on your system.
sox -t wav input.wav -c 1 -t s16 -r 16000 - |
x2x +sd | sopr -m 0.5 | x2x +ds -r |
sox -c 1 -t s16 -r 16000 - -t wav output.wav
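To see what the pipeline does to each sample, here is a minimal Python sketch that is not part of SPTK. It assumes a 16-bit PCM WAV file and mirrors the three middle stages above: convert to floating point (x2x +sd), multiply by 0.5 (sopr -m 0.5), and round back to 16-bit integers (x2x +ds -r). The names scale_samples and scale_wav are hypothetical, the clipping step is an added safeguard, and Python's round() may treat exact .5 ties differently from x2x. Unlike the sox calls, the sketch does not resample or downmix.

```python
import array
import sys
import wave


def scale_samples(samples, gain=0.5):
    """Multiply 16-bit samples by gain, round, and clip to the int16 range."""
    out = array.array("h")
    for s in samples:
        v = round(float(s) * gain)  # float conversion, scaling, rounding
        out.append(max(-32768, min(32767, v)))  # added safeguard against overflow
    return out


def scale_wav(src_path, dst_path, gain=0.5):
    """Read a 16-bit PCM WAV file, scale its samples, and write the result."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = src.readframes(params.nframes)

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":  # WAV sample data is little-endian
        samples.byteswap()

    scaled = scale_samples(samples, gain)
    if sys.byteorder == "big":
        scaled.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)  # same channels, sample width, and rate
        dst.writeframes(scaled.tobytes())
```

For example, scale_wav("input.wav", "output.wav") would write a half-amplitude copy, assuming input.wav is 16-bit PCM.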
If you would like to draw figures, please set up a Python environment first.
cd tools; make venv; cd ..
. ./tools/venv/bin/activate
impulse -l 32 | gseries impulse.png
deactivate
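Commands such as impulse pass their data to the next command as a raw stream. The sketch below shows one way such a stream might be read in Python. It assumes the stream is headerless and holds 64-bit doubles in the machine's native byte order (the double type is listed under the changes from SPTK3; the rest is an assumption). The helper names read_doubles and unit_impulse are hypothetical and not part of SPTK.

```python
import array
import io


def read_doubles(stream):
    """Parse a headerless binary stream of native-endian 64-bit floats."""
    data = stream.read()
    values = array.array("d")
    if len(data) % values.itemsize != 0:
        raise ValueError("stream length is not a multiple of the double size")
    values.frombytes(data)
    return values.tolist()


def unit_impulse(length):
    """Return a unit impulse: 1.0 followed by zeros, `length` values in total."""
    return [1.0 if i == 0 else 0.0 for i in range(length)]


# Simulate a 32-sample impulse stream in memory and read it back.
buf = io.BytesIO(array.array("d", unit_impulse(32)).tobytes())
sequence = read_doubles(buf)
```

Under the same assumptions, passing sys.stdin.buffer to read_doubles would let a script sit at the end of a pipe.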
Changes from SPTK3
- Input and output types are changed from float to double
- Signal processing classes are written in C++ instead of C
- Drawing commands are implemented in Python
- No memory leaks
- Thread-safe
- New features:
  - Conversion from/to log area ratio (lar2par and par2lar)
  - Entropy calculation (entropy)
  - Huffman coding (huffman, huffman_encode, and huffman_decode)
  - Magic number interpolation (magic_intpl)
  - Median filter (medfilt)
  - Mel-cepstrum postfilter (mcpf)
  - Mel-filter-bank extraction (fbank)
  - Nonrecursive MLPG (mlpg -R 1)
  - Pitch extraction by DIO used in WORLD (pitch -a 3)
  - Pole-zero plot (gpolezero)
  - Scalar quantization (quantize and dequantize)
  - Spectrogram plot (gspecgram)
  - Stability check of LPC coefficients (lpccheck)
  - Subband decomposition (pqmf and ipqmf)
  - Windows build support (only static library)
- Obsoleted commands:
  - acep, agcep, and amcep -> amgcep
  - bell
  - c2sp -> mgc2sp
  - cat2 and echo2
  - da
  - ds, us, us16, and uscd -> sox
  - fig
  - gc2gc -> mgc2mgc
  - gcep, mcep, and uels -> mgcep
  - glsadf, lmadf, and mlsadf -> mglsadf
  - ivq and vq -> imsvq and msvq
  - lsp2sp -> mglsp2sp
  - mgc2mgclsp and mgclsp2mgc
  - psgr and xgr
  - raw2wav, wav2raw, wavjoin, and wavsplit -> sox
- Separated commands:
  - c2ir -> c2mpir and mpir2c
  - dtw -> dtw and dtw_merge
  - mglsadf -> mglsadf and imglsadf
  - train -> train and mseq
  - ulaw -> ulaw and iulaw
  - vstat -> vstat and median
- Renamed commands:
  - mgclsp2sp -> mglsp2sp
Overview

Who we are
- Keiichi Tokuda - Produce and Design - Nagoya Institute of Technology
- Keiichiro Oura - Nagoya Institute of Technology
- Takenori Yoshimura - Main Maintainer - Nagoya Institute of Technology
- Takato Fujimoto - Nagoya Institute of Technology
Contributors to former versions of SPTK
- Akira Tamamori
- Cassia Valentini
- Chiyomi Miyajima
- Fernando Gil Resende Junior
- Gou Hirabayashi
- Heiga Zen
- Junichi Yamagishi
- Keiichi Tokuda
- Keiichiro Oura
- Kenji Chiba
- Masatsune Tamura
- Naohiro Isshiki
- Noboru Miyazaki
- Satoshi Imai
- Shinji Sako
- Tadashi Kitamura
- Takao Kobayashi
- Takashi Masuko
- Takashi Nose
- Takato Fujimoto
- Takayoshi Yoshimura
- Takenori Yoshimura
- Toru Takahashi
- Toshiaki Fukada
- Toshihiko Kato
- Toshio Kanno
- Yoshihiko Nankaku
License
This software is released under the Apache License 2.0.