Incorrect Normalization Factor for 16-bit PCM Audio in grab Method
Description
In the grab method, the normalization of 16-bit PCM samples is performed using:
list.add(buffer.get() / (float) Short.MAX_VALUE);
However, Short.MAX_VALUE is 32767, while the actual range of 16-bit PCM samples is [-32768, 32767]. This causes:
1. Asymmetry in normalization: the positive end maps exactly to 1 (32767 / 32767 = 1), but -32768 / 32767 ≈ -1.00003, which falls slightly below -1.
2. Potential out-of-range values: models that expect input strictly within [-1, 1] may misbehave on such samples.
Expected Behavior
The normalization should ensure that all values remain within the [-1, 1] range.
How to Reproduce?
Steps to reproduce
- Process a WAV file with 16-bit PCM samples using the grab method.
- Observe that a sample of -32768 normalizes to about -1.00003, slightly below -1.
What have you tried to solve it?
Use 32768.0f instead of Short.MAX_VALUE for normalization:
list.add(buffer.get() / 32768.0f);
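To make the difference concrete, here is a minimal standalone sketch that normalizes the same samples with both divisors. It is not the DJL code: the class name PcmNormalization and the helper normalize are hypothetical, and only the division itself mirrors the line quoted from grab.

```java
import java.nio.ShortBuffer;
import java.util.ArrayList;
import java.util.List;

public class PcmNormalization {

    // Hypothetical helper (not part of DJL): divides every 16-bit sample by the given divisor.
    public static List<Float> normalize(ShortBuffer buffer, float divisor) {
        List<Float> list = new ArrayList<>();
        while (buffer.hasRemaining()) {
            // short / float is promoted to float, as in the original expression
            list.add(buffer.get() / divisor);
        }
        return list;
    }

    public static void main(String[] args) {
        short[] samples = {Short.MIN_VALUE, -1, 0, 1, Short.MAX_VALUE};

        // Current behavior: dividing by 32767 pushes -32768 slightly below -1
        System.out.println(normalize(ShortBuffer.wrap(samples), (float) Short.MAX_VALUE));

        // Proposed behavior: dividing by 32768 keeps every sample within [-1, 1]
        System.out.println(normalize(ShortBuffer.wrap(samples), 32768.0f));
    }
}
```

With a divisor of 32768 the minimum sample maps exactly to -1, while the maximum sample maps to 32767 / 32768, just under 1. That small loss at the positive end is the usual trade-off for keeping the whole range inside [-1, 1].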
Additional Context:
This issue affects models that expect perfectly normalized audio input, such as WebRTC VAD.
@leleZeng Indeed, this is a bug. Would you mind creating a PR to fix it?
Fixed in PR #3646