PaddleSpeech issues

Results 289 PaddleSpeech issues


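How can I add pauses during speech synthesis?

For example, I need a countdown of 3, 2, 1 with a one-second pause between each number. How should I mark up the input text so that the synthesized speech pauses as required?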

nine-city

Question
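
One approach that does not rely on any in-text pause markup is to synthesize each segment separately and join the waveforms with explicit silence. The sketch below assumes each synthesized segment is already available as a sequence of float samples at a known sample rate. `join_with_silence` is a hypothetical helper for illustration and is not part of PaddleSpeech.

```python
from typing import List, Sequence


def join_with_silence(
    segments: Sequence[Sequence[float]],
    sample_rate: int,
    pause_seconds: float,
) -> List[float]:
    """Concatenate audio segments, inserting zero-valued samples between them.

    Hypothetical helper: the segments are assumed to be mono waveforms that
    share the same sample rate.
    """
    if sample_rate <= 0 or pause_seconds < 0:
        raise ValueError("sample_rate must be positive and pause_seconds non-negative")
    # Number of silent samples needed for the requested pause.
    pause = [0.0] * int(round(sample_rate * pause_seconds))
    joined: List[float] = []
    for index, segment in enumerate(segments):
        if index > 0:
            joined.extend(pause)  # silence only between segments, not at the ends
        joined.extend(segment)
    return joined
```

For the countdown case, the three segments would be the separately synthesized "3", "2", and "1", joined with `pause_seconds=1.0`.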

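Where can I look up the meaning of the special IDs in phone_id_map?

Thank you for taking the time to look at this question. I am using the model from examples\zh_en_tts\tts3 and want to write custom code that generates the phoneme ID sequence directly. I found a file named phone_id_map.txt, which appears to record the phoneme-to-ID table. However, some entries do not look like pinyin or English phonetic symbols, and I cannot find a detailed explanation of them. For example: ~~~ 0 1 p 271 q 272 r 273 s 274 sh 275 sil 276 sp 277 spl 278 spn 279...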

yzznw

Question
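
As a starting point for custom code, the sketch below parses text in the layout the excerpt suggests, assuming one whitespace-separated `symbol id` pair per line. The function names are hypothetical, and the sample uses only the pairs quoted in the issue. It does not explain what the special symbols mean.

```python
from typing import Dict, List


def parse_phone_id_map(text: str) -> Dict[str, int]:
    """Build a symbol -> ID dictionary from 'symbol id' lines.

    Assumption: each useful line holds exactly two whitespace-separated
    fields, and the second one is a non-negative integer.
    """
    mapping: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        symbol, idx = parts
        mapping[symbol] = int(idx)
    return mapping


def phones_to_ids(phones: List[str], mapping: Dict[str, int]) -> List[int]:
    """Look up each symbol; raises KeyError for symbols missing from the map."""
    return [mapping[p] for p in phones]


# Pairs copied from the excerpt in the issue.
sample = "p 271\nq 272\nr 273\ns 274\nsh 275\nsil 276\nsp 277\nspl 278\nspn 279\n"
phone_map = parse_phone_id_map(sample)
ids = phones_to_ids(["sil", "sh", "sp"], phone_map)
```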

Format not recognised

While trying to upload a WAV file via a web request, I am getting the following error: {"error":"Error opening '/tmp/Goodevening.wav': Format not recognised."} My api_service.py looks like this: from flask import Flask, request, jsonify...

Phalanx-01

Question
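
One possible cause of this kind of error is that the saved upload is not actually a RIFF/WAVE file, for example a different format with a .wav extension or an empty or partially written file. That is an assumption, since the issue excerpt does not show the service code. The sketch below checks the first 12 bytes of a file for the RIFF/WAVE signature before it is handed to an audio reader. Both function names are hypothetical.

```python
def looks_like_wav(header: bytes) -> bool:
    """Return True if the bytes start with a RIFF container tagged as WAVE.

    A RIFF/WAVE file begins with b'RIFF', a 4-byte size field, then b'WAVE'.
    """
    return len(header) >= 12 and header[0:4] == b"RIFF" and header[8:12] == b"WAVE"


def file_looks_like_wav(path: str) -> bool:
    """Read only the first 12 bytes of the file and check the signature."""
    with open(path, "rb") as handle:
        return looks_like_wav(handle.read(12))
```

If the check fails for the saved upload, the problem is likely on the upload or saving side rather than in the recognizer.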

[TTS]merge_yi function's bug

For support and discussions, please use our [Discourse forums](https://github.com/PaddlePaddle/DeepSpeech/discussions). If you've found a bug then please create an issue with the following information: **Describe the bug** The implementation of the merge_yi method is faulty: if a sentence contains more than one such combination, an overflow error occurs. **To Reproduce**...

lanyuer

Bug

T2S

[S2T]argparse.ArgumentError: argument --audio_file: conflicting option string: --audio_file

Traceback (most recent call last): File "/raid/ASR/paddlespeech/PaddleSpeech/paddlespeech/s2t/exps/deepspeech2/bin/test_wav.py", line 174, in parser.add_argument("--audio_file", type=str, help='audio file path') File "/home/shj/miniconda3/envs/Paddlenv/lib/python3.8/argparse.py", line 1386, in add_argument return self._add_action(action) File "/home/shj/miniconda3/envs/Paddlenv/lib/python3.8/argparse.py", line 1749, in _add_action self._optionals._add_action(action)...

xdchuan011209

Bug

S2T
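
The error in the traceback is what `argparse` raises when the same option string is registered twice on one parser. The sketch below reproduces that behavior in isolation and shows the standard-library `conflict_handler="resolve"` setting, under which the later definition replaces the earlier one. `build_parser` is a hypothetical function, and whether this matches the structure of test_wav.py is an assumption.

```python
import argparse


def build_parser(resolve: bool) -> argparse.ArgumentParser:
    """Register --audio_file twice to demonstrate the conflict.

    With the default handler ("error") the second add_argument call raises
    argparse.ArgumentError; with "resolve" the second definition wins.
    """
    handler = "resolve" if resolve else "error"
    parser = argparse.ArgumentParser(conflict_handler=handler)
    parser.add_argument("--audio_file", type=str, help="audio file path")
    parser.add_argument("--audio_file", type=str, help="audio file path (redefined)")
    return parser


try:
    build_parser(resolve=False)
    conflict_raised = False
except argparse.ArgumentError:
    conflict_raised = True

args = build_parser(resolve=True).parse_args(["--audio_file", "demo.wav"])
```

Removing the duplicate `add_argument` call is usually the cleaner fix. `conflict_handler="resolve"` is mainly useful when the first definition comes from shared code that cannot be edited.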

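What is the maximum audio duration supported for speech-to-text?

When the audio is 1 minute 47 seconds long, the program reports an error and crashes outright. Token indices sequence length is longer than the specified maximum sequence length for this model (515 > 513). Running this sequence through the model will result in indexing errors Aborted (core dumped)...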

xxch

Bug

S2T
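
A common workaround for length limits is to split a long recording into shorter pieces and transcribe each piece separately. The sketch below only computes the sample ranges for such pieces. `chunk_bounds` is a hypothetical helper, and the 30-second limit in the example is an arbitrary illustration, not a documented PaddleSpeech limit.

```python
from typing import List, Tuple


def chunk_bounds(total_samples: int, sample_rate: int, max_seconds: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample index pairs covering the whole recording.

    Every chunk holds at most sample_rate * max_seconds samples; the last
    chunk may be shorter. The end index is exclusive.
    """
    chunk = int(sample_rate * max_seconds)
    if chunk <= 0:
        raise ValueError("sample_rate * max_seconds must be at least one sample")
    return [
        (start, min(start + chunk, total_samples))
        for start in range(0, total_samples, chunk)
    ]


# A 1 min 47 s recording at 16 kHz, cut into pieces of at most 30 s.
bounds = chunk_bounds(total_samples=107 * 16000, sample_rate=16000, max_seconds=30)
```

Fixed-length cuts can split a word in half, so cutting at silent regions is usually preferable when a voice activity detector is available.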

Question about speaker diarization

## General Question /PaddleSpeech/paddlespeech/vector/exps/ecapa_tdnn/: I would like to test my own data with the open-source sv0_ecapa_tdnn_voxceleb12_ckpt_0_1_1 model. How can I do this? The project appears to support testing only on the open-source dataset shown below. ![image](https://github.com/PaddlePaddle/PaddleSpeech/assets/22997054/f1222cea-fed9-4804-bef9-4df3681976a7) ![image](https://github.com/PaddlePaddle/PaddleSpeech/assets/22997054/61f408de-2c10-44df-bed4-95ed743f3fb2)

Heavenbest

Question

Version issue, see: https://github.com/PaddlePaddle/PaddleSpeech/issues/3528

Version issue, see: https://github.com/PaddlePaddle/PaddleSpeech/issues/3528 _Originally posted by @zxcd in https://github.com/PaddlePaddle/PaddleSpeech/issues/3607#issuecomment-1846487548_ Hello. Installing the two versions given in https://github.com/PaddlePaddle/PaddleSpeech/issues/3528 does resolve the error ImportError: cannot import name 'sequence_mask' from 'paddle.fluid.layers', but a new problem then appears: ![image](https://user-images.githubusercontent.com/1428540/290734186-bad01f1b-ae05-4fe5-a713-deffadeb99b0.png)

sunqinbo

[TTS] Transformer TTS model inference is broken with paddle 2.5.1

1. [transformer tts](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/ljspeech/tts1) 2. With paddle2.4.2+(develop or r1.4.1), inference with the pretrained transformer tts model produces normal speech. 3. With paddle2.5.1+(develop or 1.4.1), inference with the pretrained transformer tts model produces abnormal speech (unclear pronunciation, choppy). I am not sure whether PaddleSpeech has not been adapted to 2.5.1 or whether something else is wrong. Environment: ```bash Package Version --------------------------- --------------- absl-py 2.0.0 aiohttp 3.8.5 aiosignal 1.3.1 annotated-types 0.5.0 antlr4-python3-runtime 4.9.3...

layne01291

Bug

T2S

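Electrical hum is audible in the training results

I trained with tts_finetune/tts3 under examples and found that, even after switching among several higher-quality corpora and adjusting the number of training steps, an electrical hum is still audible in the generated speech. I do not know which part of the pipeline to adjust to suppress this hum and would appreciate any suggestions 🙏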

chinawangyu

Question
