diffsptk
A differentiable version of SPTK
diffsptk is a differentiable version of SPTK based on the PyTorch framework.
Requirements
- Python 3.8+
- PyTorch 1.10.0+
Documentation
See this page for a reference manual.
Installation
The latest stable release can be installed through PyPI by running
pip install diffsptk
Alternatively, clone the repository and install it in editable mode:
git clone https://github.com/sp-nitech/diffsptk.git
pip install -e diffsptk
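After installation, the minimum versions listed under Requirements can be checked from Python. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `installed_version` are hypothetical and not part of diffsptk, and the distribution names `torch` and `diffsptk` are assumed to match the names used with pip.

```python
import re
import sys
from importlib import metadata


def parse_version(text):
    """Return the leading numeric components of a version string as (major, minor, patch)."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    # Components that are missing (e.g. "2.1") are treated as zero.
    return tuple(int(part) for part in match.groups(default="0"))


def installed_version(package):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Minimum versions taken from the Requirements section.
python_ok = sys.version_info >= (3, 8)
print(f"Python {sys.version.split()[0]}: {'ok' if python_ok else 'too old'}")

torch_version = installed_version("torch")
if torch_version is None:
    print("PyTorch: not installed")
else:
    torch_ok = parse_version(torch_version) >= (1, 10, 0)
    print(f"PyTorch {torch_version}: {'ok' if torch_ok else 'too old'}")

print(f"diffsptk: {installed_version('diffsptk') or 'not installed'}")
```

Local build suffixes such as `+cu113` are ignored by `parse_version`, so only the numeric release part is compared.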
Examples
Mel-cepstral analysis
import diffsptk
import torch
# Generate waveform.
x = torch.randn(100)
# Compute STFT of x.
stft = diffsptk.STFT(frame_length=12, frame_period=10, fft_length=16)
X = stft(x)
# Estimate the 4th-order mel-cepstrum of x.
mcep = diffsptk.MelCepstralAnalysis(cep_order=4, fft_length=16, alpha=0.1, n_iter=1)
mc = mcep(X)
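The `alpha` argument in the example above is the all-pass constant that controls frequency warping in mel-cepstral analysis. As a rough illustration of what `alpha=0.1` does, the sketch below evaluates the phase response of a first-order all-pass filter at the 9 frequency bins implied by `fft_length=16`. It uses only the standard library, the function name `warp_frequency` is hypothetical, and it is not a description of how diffsptk implements the analysis internally.

```python
import math


def warp_frequency(omega, alpha):
    """Warp a normalized angular frequency in [0, pi] with a first-order all-pass filter.

    Uses the phase response omega + 2 * atan(alpha * sin(omega) / (1 - alpha * cos(omega))),
    which is valid for |alpha| < 1.
    """
    return omega + 2.0 * math.atan2(
        alpha * math.sin(omega), 1.0 - alpha * math.cos(omega)
    )


alpha = 0.1
fft_length = 16

# One-sided spectrum: bins 0 .. fft_length / 2, i.e. omega from 0 to pi.
omegas = [2.0 * math.pi * k / fft_length for k in range(fft_length // 2 + 1)]
warped = [warp_frequency(omega, alpha) for omega in omegas]

for omega, w in zip(omegas, warped):
    print(f"omega = {omega:.4f} rad  ->  warped = {w:.4f} rad")
```

With a positive `alpha`, frequencies strictly between 0 and pi are mapped upward, so the low-frequency region occupies a larger share of the warped axis; `alpha=0` leaves the axis unchanged.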
Mel-spectrogram extraction
import diffsptk
import torch
# Generate waveform.
x = torch.randn(100)
# Compute STFT of x.
stft = diffsptk.STFT(frame_length=12, frame_period=10, fft_length=32)
X = stft(x)
# Apply a 4-channel mel filter bank to the STFT.
fbank = diffsptk.MelFilterBankAnalysis(n_channel=4, fft_length=32, sample_rate=8000, floor=1e-1)
Y = fbank(X)
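To give a feel for where the 4 channels in the example above might sit on the frequency axis, the sketch below spaces filter centers evenly on a mel scale between 0 Hz and the Nyquist frequency. It assumes the common HTK-style conversion `mel = 1127 * ln(1 + f / 700)`; the exact constants, band edges, and filter shapes used by `diffsptk.MelFilterBankAnalysis` may differ, and the function names here are hypothetical.

```python
import math


def hz_to_mel(hz):
    """Convert Hz to mel with the HTK-style formula (an assumption, see lead-in)."""
    return 1127.0 * math.log1p(hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * math.expm1(mel / 1127.0)


def center_frequencies(n_channel, sample_rate, f_min=0.0):
    """Return n_channel center frequencies in Hz, equally spaced on the mel axis."""
    f_max = sample_rate / 2.0  # Nyquist frequency
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    # n_channel interior points split the mel range into n_channel + 1 equal steps.
    step = (mel_max - mel_min) / (n_channel + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_channel)]


centers = center_frequencies(n_channel=4, sample_rate=8000)
for i, hz in enumerate(centers, start=1):
    print(f"channel {i}: center at about {hz:.1f} Hz")
```

Because the spacing is uniform in mel rather than in Hz, the gap between neighboring centers grows with frequency.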
Subband decomposition
import diffsptk
import torch
K = 4 # Number of subbands.
M = 40 # Order of filter.
# Generate waveform.
x = torch.randn(100)
# Decompose x.
pqmf = diffsptk.PQMF(K, M)
decimate = diffsptk.Decimation(K)
y = decimate(pqmf(x), dim=-1)
# Reconstruct x.
interpolate = diffsptk.Interpolation(K)
ipqmf = diffsptk.IPQMF(K, M)
x_hat = ipqmf(interpolate(K * y, dim=-1))
# Compute error between two signals.
error = torch.abs(x_hat - x).sum()
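The decimation and interpolation steps in the example above can be pictured on plain Python lists. This is a minimal sketch of the conventional definitions (keep every K-th sample, and insert K - 1 zeros between samples); it is an illustration rather than the diffsptk implementation, and the function names are hypothetical.

```python
def decimate(x, period):
    """Keep every `period`-th sample, starting from the first one."""
    return x[::period]


def interpolate(x, period):
    """Insert `period - 1` zeros after each sample."""
    y = [0.0] * (len(x) * period)
    y[::period] = x
    return y


K = 4
x = [float(n) for n in range(8)]

down = decimate(x, K)        # [0.0, 4.0]
up = interpolate(down, K)    # [0.0, 0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0]

print(down)
print(up)
```

Zero insertion leaves only one nonzero sample out of every K, which is one common way to motivate the gain of `K` applied to `y` before synthesis in the example above; that reading is an interpretation and is not stated in the original example.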
Status
Modules shown in ~~strikethrough~~ will not be implemented in this repository. Where a PyTorch function is given in parentheses, it can be used instead.
- [x] acorr
- [ ] ~~acr2csm~~
- [ ] ~~aeq~~ (torch.allclose)
- [ ] ~~amgcep~~
- [ ] ~~average~~ (torch.mean)
- [x] b2mc
- [ ] ~~bcp~~ (torch.split)
- [ ] ~~bcut~~
- [x] c2acr
- [x] c2mpir
- [x] c2ndps
- [x] cdist
- [ ] ~~clip~~ (torch.clip)
- [ ] ~~csm2acr~~
- [x] dct
- [x] decimate
- [x] delay
- [x] delta
- [x] dequantize
- [x] df2
- [x] dfs
- [ ] ~~dmp~~
- [ ] ~~dtw~~
- [ ] ~~dtw_merge~~
- [ ] ~~entropy~~ (torch.special.entr)
- [ ] ~~excite~~
- [ ] ~~extract~~
- [x] fbank
- [ ] ~~fd~~
- [ ] ~~fdrw~~
- [ ] ~~fft~~ (torch.fft.fft)
- [ ] ~~fft2~~ (torch.fft.fft2)
- [x] fftcep
- [ ] ~~fftr~~ (torch.fft.rfft)
- [ ] ~~fftr2~~ (torch.fft.rfft2)
- [x] frame
- [x] freqt
- [ ] ~~glogsp~~
- [ ] ~~gmm~~
- [ ] ~~gmmp~~
- [x] gnorm
- [ ] ~~gpolezero~~
- [ ] ~~grlogsp~~
- [x] grpdelay
- [ ] ~~gseries~~
- [ ] ~~gspecgram~~
- [ ] ~~gwave~~
- [ ] ~~histogram~~ (torch.histogram)
- [ ] ~~huffman~~
- [ ] ~~huffman_decode~~
- [ ] ~~huffman_encode~~
- [x] idct
- [ ] ~~ifft~~ (torch.fft.ifft)
- [ ] ~~ifft2~~ (torch.fft.ifft2)
- [x] ignorm
- [ ] imglsadf (planned)
- [x] impulse
- [x] imsvq
- [x] interpolate
- [x] ipqmf
- [x] iulaw
- [x] lar2par
- [ ] ~~lbg~~
- [x] levdur
- [x] linear_intpl
- [x] lpc
- [ ] ~~lpc2c~~
- [ ] ~~lpc2lsp~~
- [x] lpc2par
- [x] lpccheck
- [ ] ~~lsp2lpc~~
- [ ] ~~lspcheck~~
- [ ] ~~lspdf~~
- [ ] ~~ltcdf~~
- [x] mc2b
- [x] mcpf
- [ ] ~~median~~ (torch.median)
- [ ] ~~merge~~ (torch.cat)
- [x] mfcc
- [x] mgc2mgc
- [x] mgc2sp
- [x] mgcep
- [ ] mglsadf (planned)
- [ ] ~~mglsp2sp~~
- [ ] ~~minmax~~
- [x] mlpg (supports only unit variance)
- [ ] ~~mlsacheck~~
- [x] mpir2c
- [ ] ~~mseq~~
- [ ] ~~msvq~~
- [ ] ~~nan~~ (torch.isnan)
- [x] ndps2c
- [x] norm0
- [ ] ~~nrand~~ (torch.randn)
- [x] par2lar
- [x] par2lpc
- [x] pca
- [ ] ~~pcas~~
- [x] phase
- [x] pitch
- [ ] ~~pitch_mark~~
- [ ] ~~poledf~~
- [x] pqmf
- [x] quantize
- [x] ramp
- [ ] ~~reverse~~ (torch.flip)
- [ ] ~~rlevdur~~
- [x] rmse
- [ ] ~~root_pol~~
- [x] sin
- [x] smcep
- [x] snr
- [x] sopr
- [x] spec
- [x] step
- [ ] ~~swab~~
- [ ] ~~symmetrize~~
- [ ] ~~train~~
- [ ] ~~transpose~~ (torch.transpose)
- [x] ulaw
- [ ] ~~vc~~
- [ ] ~~vopr~~
- [ ] ~~vstat~~ (torch.var_mean)
- [ ] ~~vsum~~ (torch.sum)
- [x] window
- [ ] ~~x2x~~
- [x] zcross
- [x] zerodf
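Several entries in the list above name a PyTorch function in place of an SPTK command. The sketch below shows a handful of those substitutes on a small tensor. It requires PyTorch and is only meant to show the calls side by side; the PyTorch functions are not guaranteed to match the corresponding SPTK commands in every option or default.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = torch.tensor([4.0, 3.0, 2.0, 1.0])

# average -> torch.mean
mean = torch.mean(x)

# vsum -> torch.sum
total = torch.sum(x)

# vstat -> torch.var_mean (returns variance first, then mean)
var, mu = torch.var_mean(x, dim=0)

# reverse -> torch.flip
reversed_x = torch.flip(x, dims=[0])

# aeq -> torch.allclose
same = torch.allclose(reversed_x, y)

# merge -> torch.cat
merged = torch.cat([x, y])

# clip -> torch.clip
clipped = torch.clip(x, min=2.0, max=3.0)

# fftr -> torch.fft.rfft (one-sided FFT of a real signal)
spectrum = torch.fft.rfft(x)

# nan -> torch.isnan
has_nan = torch.isnan(x).any()

print(mean.item(), total.item(), same, merged.shape, spectrum.shape)
```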
License
This software is released under the Apache License 2.0.