Handling PCM-format audio with ffmpeg

Open wwfcnu opened this issue 1 year ago • 0 comments

Notice: To resolve issues more efficiently, please raise issues following the template and fill in the details.

🐛 Bug

```python
from subprocess import CalledProcessError, run

import numpy as np


def _load_audio_ffmpeg(file: str, sr: int = 16000):
    """
    Open an audio file and read as mono waveform, resampling as necessary

    Parameters
    ----------
    file: str
        The audio file to open

    sr: int
        The sample rate to resample the audio if necessary

    Returns
    -------
    A NumPy array containing the audio waveform, in float32 dtype.
    """

    # This launches a subprocess to decode audio while down-mixing
    # and resampling as necessary.  Requires the ffmpeg CLI in PATH.
    # fmt: off
    cmd = [
        "ffmpeg",
        "-nostdin",
        "-threads", "0",
        "-i", file,
        "-f", "s16le",
        "-ac", "1",
        "-acodec", "pcm_s16le",
        "-ar", str(sr),
        "-"
    ]
    # fmt: on
    try:
        out = run(cmd, capture_output=True, check=True).stdout
    except CalledProcessError as e:
        raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

    # Convert signed 16-bit samples to float32 in the range [-1.0, 1.0).
    return np.frombuffer(out, np.int16).flatten().astype(np.float32) / 32768.0
```

If the input file is in PCM format, this function raises an error.
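A likely cause is that a raw PCM file has no header, so ffmpeg cannot infer the sample format, sample rate, or channel count from the file itself and needs them as input options placed before `-i`. The sketch below shows one possible way to build such a command. The helper name `build_ffmpeg_cmd`, the parameters `pcm_sr` and `pcm_channels`, and the extension-based detection are all assumptions made for illustration and are not part of FunASR. It also assumes the raw data is signed 16-bit little-endian.

```python
from pathlib import Path
from typing import List

# Assumption: raw PCM files are recognized by their extension only.
RAW_PCM_SUFFIXES = {".pcm", ".raw"}


def build_ffmpeg_cmd(
    file: str,
    sr: int = 16000,
    pcm_sr: int = 16000,      # hypothetical: sample rate of the raw PCM input
    pcm_channels: int = 1,    # hypothetical: channel count of the raw PCM input
) -> List[str]:
    """Build an ffmpeg command that decodes `file` to mono s16le at `sr` Hz."""
    cmd = ["ffmpeg", "-nostdin", "-threads", "0"]

    if Path(file).suffix.lower() in RAW_PCM_SUFFIXES:
        # Headerless input: describe it explicitly. In ffmpeg, options that
        # appear before -i apply to the input file.
        cmd += ["-f", "s16le", "-ar", str(pcm_sr), "-ac", str(pcm_channels)]

    # Same output options as the original function.
    cmd += [
        "-i", file,
        "-f", "s16le",
        "-ac", "1",
        "-acodec", "pcm_s16le",
        "-ar", str(sr),
        "-",
    ]
    return cmd
```

The returned list could be passed to `run(cmd, capture_output=True, check=True)` in place of the hard-coded `cmd`. If the raw file is already mono 16-bit audio at the target rate, reading it directly with `np.fromfile(file, dtype=np.int16)` would avoid the ffmpeg subprocess altogether.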

wwfcnu, Sep 14 '24 07:09