Clicking sounds or muted gaps where audio segments join

Open SongJinXue opened this issue 4 years ago • 3 comments

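Hello. After splitting the audio into 4-second segments for speech enhancement, there are clicking sounds or muted gaps where the segments join. Changing the segment length from 4 s to 1 s makes the problem worse. How can this be removed? Is it caused by the audio being discontinuous?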

SongJinXue avatar Sep 15 '20 08:09 SongJinXue

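What data are you using? What happens if you disable the 4-second segmentation, i.e. set https://github.com/huyanxin/phasen/blob/31ee2f1ba89b535142a5189abf913e9ac7f36404/steps/run_phasen.py#L196 to false? It would be best if you could send me some samples to listen to. Normally this should not happen, because the segmentation itself uses overlap to keep adjacent segments continuous. Please send the samples to my email: [email protected]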

huyanxin avatar Sep 16 '20 02:09 huyanxin

Setting decode_do_segement = false gives somewhat better results, but the audio is still muted at times. I have sent the samples to your email.

SongJinXue avatar Sep 16 '20 06:09 SongJinXue

Cross-fading segments instead of hard-cutting them can alleviate discontinuities. See illustration below.

Hard-cut: (image)

Cross-fade: (image)

Note: This only crudely illustrates the concept. Do not actually use the illustrated curve for cross-fading. Use equal-power cross-fading instead, for example with square-root fade curves.
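A minimal sketch of what such a join could look like, assuming each enhanced segment is a plain sequence of float samples and that the last `overlap` samples of one segment cover the same time span as the first `overlap` samples of the next. The function name `crossfade_join` and its parameters are made up for illustration and are not part of phasen:

```python
import math


def crossfade_join(prev, nxt, overlap):
    """Join two segments, cross-fading over their shared `overlap` samples.

    Hypothetical helper: `prev` and `nxt` are lists of float samples whose
    last/first `overlap` samples are assumed to cover the same time span.
    """
    if overlap <= 0 or overlap > min(len(prev), len(nxt)):
        raise ValueError("overlap must be between 1 and the shorter segment length")

    start = len(prev) - overlap
    mixed = []
    for i in range(overlap):
        # Fade position, strictly between 0 and 1.
        t = (i + 1) / (overlap + 1)
        # Square-root curves: fade_out**2 + fade_in**2 == 1 (equal power).
        fade_out = math.sqrt(1.0 - t)
        fade_in = math.sqrt(t)
        mixed.append(prev[start + i] * fade_out + nxt[i] * fade_in)

    # Untouched part of prev, the blended overlap, untouched part of nxt.
    return list(prev[:start]) + mixed + list(nxt[overlap:])


# Example: two short segments sharing a 2-sample overlap.
joined = crossfade_join([1.0] * 6, [0.0] * 6, overlap=2)
```

In practice this would more likely be done on NumPy arrays with vectorised gains. Also, if the two overlapping regions are nearly identical (as they may be for two enhanced versions of the same audio), a linear fade whose gains sum to 1 might keep the level steadier than square-root curves, so it is probably worth comparing both by ear.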

iver56 avatar Sep 30 '20 07:09 iver56