librosa icon indicating copy to clipboard operation
librosa copied to clipboard

NaN values from LPC implementation at high LPC order

Open Phuriches opened this issue 2 years ago • 2 comments

Describe the bug An output obtained from the LPC function shows NaNs when using a higher LPC order like 40 with the example provided in https://librosa.org/doc/main/generated/librosa.lpc.html. This problem may be caused from using Numba’s JIT compiler for optimization.

To Reproduce

import librosa
import scipy
import numpy as np
import matplotlib.pyplot as plt

sr = None # orginal sampling rate
order = 40 # LPC order

y, sr = librosa.load(librosa.ex('libri1'), sr=sr, duration=5)
print(f'wave length: {len(y)}')
print(f'sampling rate: {sr} Hz')

lpc_org = librosa.lpc(y, order=order)
print(f'lpc array: {lpc_org}')
print(f'lpc shape: {lpc_org.shape}')
b = np.hstack([[0], -1 * lpc_org[1:]])
y_hat = scipy.signal.lfilter(b, [1], y)

fig, ax = plt.subplots()
ax.plot(y)
ax.plot(y_hat, linestyle='--')
ax.legend(['y', 'y_hat'])
ax.set_title('LP Model Forward Prediction');

output:

wave length: 110250
sampling rate: 22050 Hz
lpc array: [ 1. nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
 nan nan nan nan nan]
lpc shape: (41,)

image

Expected behavior Without the JIT compiler, it seems that we could get the expected LPC array.

from utils import lpc as lpc_without_jit

y, sr = librosa.load(librosa.ex('libri1'), sr=sr, duration=5)
print(f'wave length: {len(y)}')
print(f'sampling rate: {sr} Hz')

lpc_fixed = lpc_without_jit(y, order=order)
print(f'lpc array: {lpc_fixed}') 
print(f'lpc shape: {lpc_fixed.shape}')
b = np.hstack([[0], -1 * lpc_fixed[1:]])
y_hat = scipy.signal.lfilter(b, [1], y)

fig, ax = plt.subplots()
ax.plot(y)
ax.plot(y_hat, linestyle='--')
ax.legend(['y', 'y_hat'])
ax.set_title('LP Model Forward Prediction');

output:

wave length: 110250
sampling rate: 22050 Hz
lpc array: [  1.          -4.714679    12.092463   -22.750685    34.42267
 -43.064598    44.181843   -35.062294    16.7864       5.499286
 -24.326952    32.907856   -28.375805    13.304849     5.3035507
 -19.183268    22.753494   -15.54834      2.0972812   10.495347
 -16.214157    13.081286    -3.6138062   -6.8647375   13.14599
 -12.690356     6.4904933    1.8296452   -8.187643    10.07067
  -7.3736496    1.9960763    3.4351702   -6.927907     7.7992835
  -6.5942974    4.442374    -2.380179     0.9796509   -0.288178
   0.04925828]
lpc shape: (41,)

image

Software versions*

Linux-5.15.0-67-generic-x86_64-with-glibc2.31
Python 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0]
NumPy 1.24.3
SciPy 1.10.1
librosa 0.10.0.post2
INSTALLED VERSIONS
------------------
python: 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0]

librosa: 0.10.0.post2

audioread: 3.0.0
numpy: 1.24.3
scipy: 1.10.1
sklearn: 1.2.2
joblib: 1.2.0
decorator: 5.1.1
numba: 0.57.0
soundfile: 0.12.1
pooch: v1.6.0
soxr: 0.3.5
typing_extensions: installed, no version number available
lazy_loader: installed, no version number available
msgpack: 1.0.5

numpydoc: None
sphinx: None
sphinx_rtd_theme: None
matplotlib: 3.7.1
sphinx_multiversion: None
sphinx_gallery: None
mir_eval: None
ipython: None
sphinxcontrib.rsvgconverter: None
pytest: None
pytest_mpl: None
pytest_cov: None
samplerate: None
resampy: None
presets: None
packaging: 23.1

Phuriches avatar Aug 09 '23 01:08 Phuriches

Interesting, thanks for the thorough report. It definitely smells like numerical underflow.

Can you check if you still get this issue if the input signal is in float64 format instead of float32? (You can do this by saying dtype=np.float64 when loading the signal.)

bmcfee avatar Aug 09 '23 12:08 bmcfee

Quick follow-up, now that I'm back on a proper computer. I tried the above example with float64 format, and it indeed works correctly:

wave length: 110250
sampling rate: 22050 Hz
lpc array: [  1.          -4.71516293  12.09485601 -22.75681033  34.43338165
 -43.07836799  44.19407855 -35.06638442  16.77655389   5.5244911
 -24.3623252   32.94244647 -28.39704634  13.30435169   5.32552459
 -19.21720199  22.78459487 -15.56335598   2.09073033  10.51883742
 -16.24249484  13.10109408  -3.61684028  -6.87816967  13.16809423
 -12.71038931   6.50011478   1.83293082  -8.20037027  10.08608018
  -7.38530683   2.00052831   3.43773793  -6.93480838   7.80720865
  -6.60083495   4.44658485  -2.38231029   0.98047271  -0.28840149
   0.04929236]
lpc shape: (41,)

Since it also works with jit compilation disabled, I think we can safely say that this is upstream behavior due to the compiler optimizer. There may be some part of the code that we could restructure to be more numerically stable in reduced precision, but nothing immediately jumps out to me.

bmcfee avatar Aug 09 '23 13:08 bmcfee