SPTK

The Speech Signal Processing Toolkit (SPTK) is a suite of speech signal processing tools.

Older version: SPTK3
PyTorch version: diffsptk

Documentation

See this page for a reference manual.

Requirements

GCC 4.8.5+ / Clang 3.5.0+ / Visual Studio 2015+
CMake 3.1+

Installation

Linux / macOS

The latest release can be downloaded through Git. The install procedure is as follows.

git clone https://github.com/sp-nitech/SPTK.git
cd SPTK
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=..  # Please change install directory.
make -j 4 install  # Please change the number of jobs depending on your environment.

The SPTK commands can then be used by adding the bin/ directory to the PATH environment variable. To use only some of the SPTK functions in your own program, link against the static library lib/libsptk.a.
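
As a minimal sketch of the PATH step, the snippet below prepends bin/ to PATH and checks whether an SPTK command can be found. SPTK_ROOT is a hypothetical variable introduced here for illustration; it is assumed to point at the install prefix, which is the repository root when -DCMAKE_INSTALL_PREFIX=.. is used as above.

```shell
# Assumption: SPTK_ROOT is a hypothetical variable pointing at the install
# prefix (the repository root when -DCMAKE_INSTALL_PREFIX=.. is used).
SPTK_ROOT="${SPTK_ROOT:-$HOME/SPTK}"

# Prepend bin/ to PATH only if it is not already there.
case ":$PATH:" in
    *":$SPTK_ROOT/bin:"*) ;;
    *) PATH="$SPTK_ROOT/bin:$PATH" ;;
esac
export PATH

# Report whether a command used in this README (x2x) can now be found.
if command -v x2x >/dev/null 2>&1; then
    echo "SPTK commands found: $(command -v x2x)"
else
    echo "x2x not found; check that SPTK_ROOT points at the install prefix" >&2
fi
```

To make the setting persistent, the same lines could be placed in a shell startup file such as ~/.bashrc.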

Windows

You may need to add cmake and MSBuild to the PATH environment variable in advance. Open Command Prompt and follow the procedure below:

REM Change this to the path of your SPTK directory.
cd /path/to/SPTK
mkdir build
cd build
REM Change the install directory as needed.
cmake .. -DCMAKE_INSTALL_PREFIX=..
MSBuild -maxcpucount:4 /p:Configuration=Release INSTALL.vcxproj

Instead of running MSBuild, you can compile the programs via the GUI. The SPTK functions can then be used by linking the static library lib/sptk.lib.

Demonstration

Twitter
Analysis-synthesis via mel-cepstrum
Parametric coding via line spectral pairs

Examples

SPTK provides several examples. Go to an example directory and execute run.sh, e.g.,

cd egs/analysis_synthesis/mgc
./run.sh

The following is a simple example that lowers the volume of the audio in input.wav by multiplying every sample by 0.5 (sopr -m 0.5). You may need to install the sox command on your system.

sox -t wav input.wav -c 1 -t s16 -r 16000 - |
    x2x +sd | sopr -m 0.5 | x2x +ds -r |
    sox -c 1 -t s16 -r 16000 - -t wav output.wav
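
The scaling step in the pipeline above can be tried on plain numbers, without any audio file. The following is a minimal sketch, assuming the SPTK commands are already in PATH; scale_half is a hypothetical helper name used only here.

```shell
# scale_half is a hypothetical helper name used only for this sketch.
# It reads ASCII numbers, converts them to double (x2x +ad), multiplies
# each value by 0.5 (sopr -m 0.5), and converts back to ASCII (x2x +da).
scale_half() {
    x2x +ad | sopr -m 0.5 | x2x +da
}

# Run the helper only when the SPTK commands are available.
if command -v x2x >/dev/null 2>&1 && command -v sopr >/dev/null 2>&1; then
    echo 1 2 3 | scale_half
else
    echo "SPTK commands not found in PATH; skipping" >&2
fi
```

The audio pipeline follows the same pattern, except that sox supplies 16-bit samples, so the conversions are x2x +sd on input and x2x +ds -r (with rounding) on output.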

To draw figures, please prepare a Python environment.

cd tools; make venv; cd ..
. ./tools/venv/bin/activate
impulse -l 32 | gseries impulse.png
deactivate

Changes from SPTK3

Input and output types are changed to double from float
Signal processing classes are written in C++ instead of C
Drawing commands are implemented in Python
No memory leaks
Thread-safe
New features:
- Conversion from/to log area ratio (lar2par and par2lar)
- Entropy calculation (entropy)
- Huffman coding (huffman, huffman_encode, and huffman_decode)
- Magic number interpolation (magic_intpl)
- Median filter (medfilt)
- Mel-cepstrum postfilter (mcpf)
- Mel-filter-bank extraction (fbank)
- Nonrecursive MLPG (mlpg -R 1)
- Pitch extraction by DIO used in WORLD (pitch -a 3)
- Pole-zero plot (gpolezero)
- Scalar quantization (quantize and dequantize)
- Spectrogram plot (gspecgram)
- Stability check of LPC coefficients (lpccheck)
- Subband decomposition (pqmf and ipqmf)
- Windows build support (only static library)
Obsoleted commands:
- acep, agcep, and amcep -> amgcep
- bell
- c2sp -> mgc2sp
- cat2 and echo2
- da
- ds, us, us16, and uscd -> sox
- fig
- gc2gc -> mgc2mgc
- gcep, mcep, and uels -> mgcep
- glsadf, lmadf, and mlsadf -> mglsadf
- ivq and vq -> imsvq and msvq
- lsp2sp -> mglsp2sp
- mgc2mgclsp and mgclsp2mgc
- psgr and xgr
- raw2wav, wav2raw, wavjoin, and wavsplit -> sox
Separated commands:
- c2ir -> c2mpir and mpir2c
- dtw -> dtw and dtw_merge
- mglsadf -> mglsadf and imglsadf
- train -> train and mseq
- ulaw -> ulaw and iulaw
- vstat -> vstat and median
Renamed commands:
- mgclsp2sp -> mglsp2sp
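
Several of the obsoleted commands listed above (for example raw2wav and wav2raw) are replaced by sox. The following is a rough sketch of such a replacement, reusing the sox options from the Examples section (mono, signed 16-bit, 16 kHz). The file names are placeholders, and the options may need adjusting to match your data.

```shell
# File names are placeholders. The sox options mirror the ones used in the
# Examples section: mono, signed 16-bit samples, 16 kHz.
if command -v sox >/dev/null 2>&1 && [ -f input.wav ]; then
    # WAV -> headerless raw samples (the role formerly played by wav2raw).
    sox -t wav input.wav -c 1 -t s16 -r 16000 output.raw
    # Headerless raw samples -> WAV (the role formerly played by raw2wav).
    sox -c 1 -t s16 -r 16000 output.raw -t wav roundtrip.wav
else
    echo "sox or input.wav not available; skipping" >&2
fi
```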

Overview

diagram

Who we are

Keiichi Tokuda - Produce and Design - Nagoya Institute of Technology
Keiichiro Oura - Nagoya Institute of Technology
Takenori Yoshimura - Main Maintainer - Nagoya Institute of Technology
Takato Fujimoto - Nagoya Institute of Technology

Contributors to former versions of SPTK

Akira Tamamori
Cassia Valentini
Chiyomi Miyajima
Fernando Gil Resende Junior
Gou Hirabayashi
Heiga Zen
Junichi Yamagishi
Keiichi Tokuda
Keiichiro Oura
Kenji Chiba
Masatsune Tamura
Naohiro Isshiki
Noboru Miyazaki
Satoshi Imai
Shinji Sako
Tadashi Kitamura
Takao Kobayashi
Takashi Masuko
Takashi Nose
Takato Fujimoto
Takayoshi Yoshimura
Takenori Yoshimura
Toru Takahashi
Toshiaki Fukada
Toshihiko Kato
Toshio Kanno
Yoshihiko Nankaku

License

This software is released under the Apache License 2.0.
