gpuRIR
Impulse responses of gpuRIR very different compared with pyroomacoustics with similar settings
Hi,
While trying to obtain similar RIRs with gpuRIR and pyroomacoustics, I always obtain very different filters for similar acoustic settings: the filters generated with gpuRIR appear to be about one order of magnitude smaller. Can you give me any hints?
Thank you very much
Hi Fra,
Could you provide a minimal example where this behavior can be observed and, if possible, a plot of the RIRs obtained with it?
Best regards, David
Here is the filter generated with gpuRIR:
Here the filter from pyroomacoustics:
The room dimensions were set to: [4.585, 6.903, 3.144]
To generate the absorption coefficients and the maximum number of reflections, I used the Sabine formula in both cases (it is available in both packages) with T60=0.6.
Here I could already see something weird: from pyroom I get absorption_coeff=0.19714386610054616, whereas the gpuRIR implementation of the Sabine formula gives a value of 0.8961001. On the other hand, for the max_reflection_order I get 79 from pyroom and [23, 15, 33] for gpuRIR (71 in total).
Thanks for reporting this. There might be some differences between gpuRIR and pyroomacoustics, since gpuRIR uses negative reflection coefficients and (optionally) a random late reverberation model, but this does indeed seem to be too large a difference. I'll take a look at this during the week.
Best regards, David
Yes, I noticed that from the paper. In fact, I removed the late reverberation part by not setting the Tdiff argument in the simulateRIR function.
The implementations of the Sabine formula in both libraries are equivalent; the difference is that gpuRIR returns the reflection coefficients (beta) while pyroomacoustics returns the energy absorption coefficients (alpha). You can see how alpha = 1 - beta². Be aware that pyroomacoustics used to work with amplitude coefficients instead of energy coefficients and, if you're using energy coefficients, you must use the parameter materials instead of absorption (which is deprecated).
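As a rough cross-check of the numbers reported above, here is a minimal sketch of the textbook Sabine formula in plain Python. It does not call either library; the function name `sabine_energy_absorption` and the speed of sound `c = 343` m/s are assumptions made for illustration, and beta is treated as a pressure (amplitude) reflection coefficient, which is what alpha = 1 - beta² implies.

```python
import math


def sabine_energy_absorption(room_dim, t60, c=343.0):
    """Uniform energy absorption coefficient from Sabine's formula.

    T60 = 24 * ln(10) * V / (c * S * alpha), solved for alpha.
    Hypothetical helper for illustration; not part of gpuRIR or pyroomacoustics.
    """
    lx, ly, lz = room_dim
    volume = lx * ly * lz
    surface = 2.0 * (lx * ly + lx * lz + ly * lz)
    return 24.0 * math.log(10.0) * volume / (c * surface * t60)


room_dim = [4.585, 6.903, 3.144]  # room from this thread [m]
alpha = sabine_energy_absorption(room_dim, t60=0.6)
beta = math.sqrt(1.0 - alpha)  # reflection coefficient implied by alpha

print(f"alpha = {alpha:.4f}")  # about 0.197, the value reported for pyroom
print(f"beta  = {beta:.4f}")   # about 0.896, close to the value reported for gpuRIR
```

The small remaining gap between the two reported values is consistent with the libraries using slightly different constants, but that is a guess rather than something verified here.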
About the RIRs from gpuRIR having lower amplitude than the ones from pyroomacoustics: the amplitude of the direct path is expected to be A_dp=1/(4*pi*d). I don't know the distance between the source and the receiver in your simulation, but to get an amplitude of about A_dp=0.5 you would need a distance of just d=0.16 meters, which is quite short and, from the shape of your RIRs, doesn't seem to be your case. I guess the higher amplitude in pyroomacoustics could be due to the low-frequency artifact generated by using positive reflection coefficients, but I can't tell for sure.
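The arithmetic behind that estimate can be checked with a few lines of Python. This is only a sketch of the free-field 1/(4*pi*d) relation quoted above; the helper names are made up for the example.

```python
import math


def direct_path_amplitude(d):
    """Free-field direct-path amplitude at distance d [m]: 1 / (4 * pi * d)."""
    return 1.0 / (4.0 * math.pi * d)


def distance_for_amplitude(a):
    """Distance [m] at which the direct-path amplitude equals a."""
    return 1.0 / (4.0 * math.pi * a)


print(direct_path_amplitude(0.16))   # about 0.497
print(distance_for_amplitude(0.5))   # about 0.159 m
print(direct_path_amplitude(1.0))    # about 0.080 at 1 m
```

So at a source-receiver distance of a metre or more, direct-path peaks below 0.1 are what this relation predicts.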
A_dp=1/(4*pi*d) should hold if you use delayed impulses. However, I guess that if you use the sinc + Hanning window, other images also contribute to the direct path, don't they?
However, below you can see a plot comparing RIRs from pyroom, gpuRIR, and field-measured RIRs with similar acoustic parameters from the BUTReverb dataset. At the top of the plot I added the distance between the source and the recording microphone.
PS: about pyroom, I see your point and I know about the deprecation, but I wanted to compare the parameters calculated with the two tools, and the coefficients are in reasonable agreement.
> A_dp=1/(4*pi*d) should hold if you use delayed impulses. However, I guess that if you use the sinc + Hanning window, other images also contribute to the direct path, don't they?
Yeah, that's right. That formula is employed for all the image sources (just using the distance of the image source instead of that of the original source) and, as you can see, it would generate amplitudes of the same order of magnitude as the ones obtained by gpuRIR. As you say, the use of a windowed sinc function can affect the amplitudes of the different peaks, since the side lobes of each peak overlap with the rest of the peaks (though it would be even worse if, instead of using sinc functions, we just rounded the fractional delays). Using positive reflection coefficients can make this effect stronger since all the peaks are positive, but I'm not sure whether this alone can explain such a big difference.
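To make the side-lobe overlap concrete, here is a toy sketch in plain Python. It renders one or two impulses at fractional delays with a Hann-windowed sinc and reads the value at a single sample. The window width of 81 samples, the delays, and the amplitudes are arbitrary assumptions for the example and are not taken from gpuRIR or pyroomacoustics.

```python
import math


def windowed_sinc(n, delay, width=81):
    """Hann-windowed sinc centred on a fractional delay (in samples)."""
    t = n - delay
    if abs(t) > width / 2:
        return 0.0
    sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
    hann = 0.5 * (1.0 + math.cos(2.0 * math.pi * t / width))
    return hann * sinc


def render(peaks, length=120):
    """Sum the windowed sincs of all (delay, amplitude) pairs."""
    return [sum(a * windowed_sinc(n, d) for d, a in peaks) for n in range(length)]


single = render([(50.3, 1.0)])                   # one isolated peak
same_sign = render([(50.3, 1.0), (52.8, 0.8)])   # neighbour with the same sign
opposite = render([(50.3, 1.0), (52.8, -0.8)])   # neighbour with opposite sign

# Sample 50 is the closest sample to the first peak.
print(single[50])     # about 0.86: the fractional delay alone lowers the sampled peak
print(same_sign[50])  # about 0.91: the neighbour's side lobe adds to it
print(opposite[50])   # about 0.81: the neighbour's side lobe subtracts from it
```

Whether a neighbouring peak raises or lowers a given sample depends on the spacing, because the side lobes alternate in sign, so this only shows that the effect exists and not how large it is in a full RIR.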
> However, below you can see a plot comparing RIRs from pyroom, gpuRIR, and field-measured RIRs with similar acoustic parameters from the BUTReverb dataset. At the top of the plot I added the distance between the source and the recording microphone.
I have to admit I have never compared the results of gpuRIR with field-measured RIRs. To validate gpuRIR, I compared its results with those from Lehmann's Matlab library. I've rerun some Matlab simulations to check whether any of the latest gpuRIR updates might have introduced a bug, but everything looks fine:

I don't know how the amplitudes of the BUTReverb dataset have been normalized; I guess that's not a trivial task. Usually you only care about the position and the relative amplitude of each peak, and the absolute amplitude doesn't matter (you usually have to normalize the results after applying the RIR anyway), so maybe they just ensured that the RIRs were never saturated and were coherent across the dataset.
> PS: about pyroom, I see your point and I know about the deprecation, but I wanted to compare the parameters calculated with the two tools, and the coefficients are in reasonable agreement.
If you're using energy absorption coefficients, like the ones provided by pyroomacoustics.inverse_sabine or 1-beta**2 where beta is the result from gpuRIR.beta_SabineEstimation, you must pass them to pyroomacoustics using materials=pra.Material(e_absorption) when creating the room object. However, if you want to use amplitude absorption coefficients, like 1-beta, you must pass them using absorption=a_absorption. Mixing them will lead to RIRs with a T60 different from the expected one and may explain why, in your last figures, the RIRs from pyroomacoustics seem to have a much longer T60.
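A back-of-the-envelope sketch of how much such a mix-up could matter, using only Sabine's formula and the beta value reported earlier in this thread. It does not call pyroomacoustics; `sabine_t60` is a hypothetical helper, c = 343 m/s is assumed, and which mix-up (if any) happened in the figures above is not known.

```python
import math


def sabine_t60(room_dim, e_absorption, c=343.0):
    """T60 predicted by Sabine's formula for a uniform energy absorption."""
    lx, ly, lz = room_dim
    volume = lx * ly * lz
    surface = 2.0 * (lx * ly + lx * lz + ly * lz)
    return 24.0 * math.log(10.0) * volume / (c * surface * e_absorption)


room_dim = [4.585, 6.903, 3.144]  # room from this thread [m]
beta = 0.8961001                  # value reported for gpuRIR above

e_absorption = 1.0 - beta ** 2    # energy absorption coefficient
a_absorption = 1.0 - beta         # amplitude absorption coefficient

# Correct use: energy coefficient interpreted as an energy coefficient.
t60_correct = sabine_t60(room_dim, e_absorption)

# Mix-up 1: amplitude coefficient interpreted as an energy coefficient.
t60_amp_as_energy = sabine_t60(room_dim, a_absorption)

# Mix-up 2: energy coefficient interpreted as an amplitude coefficient,
# i.e. the walls reflect (1 - e_absorption) in amplitude.
t60_energy_as_amp = sabine_t60(room_dim, 1.0 - (1.0 - e_absorption) ** 2)

print(f"{t60_correct:.2f} s")        # about 0.60 s
print(f"{t60_amp_as_energy:.2f} s")  # about 1.14 s, much longer
print(f"{t60_energy_as_amp:.2f} s")  # about 0.33 s, much shorter
```

Under these assumptions, the first kind of mix-up roughly doubles the T60, which would be in line with a visibly longer decay.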
To sum up: I think that the results obtained with gpuRIR are what can be expected from the Image Source Method (at least when using negative reflection coefficients) and there's nothing wrong with them. About pyroomacoustics, I don't know whether its higher amplitudes can be explained just by the use of positive reflection coefficients or whether they're doing some extra processing that explains it but, in any case, I think it would be better to ask its developers.
Best regards, David
Hi,
I am also observing a difference between gpuRIR and the other libraries (rirgen and rir_generator). Please see below and let me know if I am missing anything:
import rirgen
import gpuRIR
import rir_generator
import numpy as np
import matplotlib.pyplot as plt

T60 = 0.413  # Reverberation time [s]
room_dim = [7.875, 5.839, 3.088]  # Room dimensions [m]
fs = 8000  # Sampling frequency [Hz]
room_sz = room_dim  # Size of the room [m]
pos_src = np.array([
    [3.810, 1.919, 1.423],
])  # Positions of the sources [m]
pos_rcv = np.array([
    [3.974, 2.979, 1.418],
])  # Positions of the receivers [m]
att_diff = 15.0  # Attenuation when start using the diffuse reverberation model [dB]
att_max = 60.0  # Attenuation at the end of the simulation [dB]

beta = gpuRIR.beta_SabineEstimation(room_sz, T60)  # Reflection coefficients
Tdiff = gpuRIR.att2t_SabineEstimator(att_diff, T60)  # Time to start the diffuse reverberation model [s]
Tmax = gpuRIR.att2t_SabineEstimator(att_max, T60)  # Time to stop the simulation [s]
nb_img = gpuRIR.t2n(Tdiff, room_sz)  # Number of image sources in each dimension

# RIRs from gpuRIR, indexed as [source, receiver, sample]
gpuRIR_RIRs = gpuRIR.simulateRIR(room_sz, beta, pos_src, pos_rcv, nb_img, Tmax, fs, Tdiff=Tdiff, c=343)

# RIRs from rir_generator, indexed as [sample, receiver]
rir_generator_RIRs = rir_generator.generate(
    c=343,
    fs=fs,
    r=np.ascontiguousarray(pos_rcv),
    s=np.ascontiguousarray(pos_src[0]),
    L=np.ascontiguousarray(room_dim),
    reverberation_time=T60,
    mtype=rir_generator.mtype.omnidirectional,
)

# RIRs from rirgen, one per receiver
rirgen_RIR = rirgen.generate_rir(
    room_measures=room_dim,
    source_position=pos_src[0],
    receiver_positions=np.ascontiguousarray(pos_rcv),
    reverb_time=T60,
    sound_velocity=343,
    fs=fs,
)

# Compare the first 200 samples of the three RIRs
plt.plot(gpuRIR_RIRs[0, 0, :], label="gpuRIR", color="k")
plt.plot(rir_generator_RIRs[:, 0], label="rir_generator")
plt.plot(rirgen_RIR[0], label="rirgen")
plt.xlim(0, 200)
plt.legend()
plt.show()
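For reference, the expected position and size of the first peak for this geometry can be estimated by hand. This is a minimal sketch that assumes the 1/(4*pi*d) direct-path scaling quoted earlier in this thread and ignores any extra filter delay or high-pass filtering a particular library may apply.

```python
import math

fs = 8000   # sampling frequency [Hz], as in the script above
c = 343.0   # speed of sound [m/s], as in the script above
pos_src = [3.810, 1.919, 1.423]  # source position [m]
pos_rcv = [3.974, 2.979, 1.418]  # receiver position [m]

d = math.dist(pos_src, pos_rcv)       # source-receiver distance [m]
delay_samples = d / c * fs            # direct-path arrival, in samples
amplitude = 1.0 / (4.0 * math.pi * d) # assumed free-field scaling

print(f"d = {d:.3f} m")                       # about 1.07 m
print(f"delay = {delay_samples:.1f} samples") # about 25 samples
print(f"amplitude = {amplitude:.3f}")         # about 0.074
```

A first peak near sample 25 in the plot would therefore be consistent across libraries, whatever the sign of the later peaks.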
Hello,
As explained in the paper, gpuRIR uses negative reflection coefficients, so you can expect about half of the peaks to be negative. Apart from that, you can see how the timing and the absolute value of the peaks are the same for all the libraries.
In the picture, you can also observe how rirgen has some artifacts after some peaks (e.g. from 25 to 50 ms after the first peak) that are not generated by gpuRIR.
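To illustrate why about half of the peaks come out negative, here is a toy one-dimensional image-source sketch. It is not gpuRIR's implementation: the room length, positions, coefficient value, and the function `image_sources_1d` are all assumptions chosen for the example. Each image picks up one factor of beta per wall reflection, so with a negative beta the sign alternates with the reflection count.

```python
import math


def image_sources_1d(room_len, src, rcv, beta, max_order):
    """(distance, amplitude) of image sources in a 1D room with walls at 0 and room_len."""
    images = []
    for k in range(-max_order, max_order + 1):
        # Images at 2*k*L + src have undergone |2k| reflections,
        # images at 2*k*L - src have undergone |2k - 1| reflections.
        for sign, n_refl in ((+1, abs(2 * k)), (-1, abs(2 * k - 1))):
            if n_refl > max_order:
                continue
            pos = 2 * k * room_len + sign * src
            d = abs(pos - rcv)
            images.append((d, beta ** n_refl / (4.0 * math.pi * d)))
    return sorted(images)


images = image_sources_1d(room_len=4.0, src=1.0, rcv=2.5, beta=-0.9, max_order=6)
n_negative = sum(1 for _, amp in images if amp < 0)

print(f"{n_negative} of {len(images)} peaks are negative")  # 6 of 13
print(images[0])  # direct path: shortest distance, positive amplitude
```

With a positive beta of the same magnitude, the distances and absolute amplitudes would be identical and only the signs would change, which matches the observation that timing and absolute peak values agree between the libraries.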
I think those are the only differences between libraries in the picture.
Best, David