EmotiVoice-generated speech has a popping sound at the start

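I am generating the speech clips through the API. Not every generated clip has this problem, but quite a few have a pop or a click at the very beginning, like an electrical crackle. What causes this? Has anyone else run into it?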

Jan 27 '24 04:01 zsanjin-p

Could you please provide more details about this issue, such as the specific text, speaker ID, and audio samples?

Jan 29 '24 02:01 syq163

I ran into this too. It happens no matter which speaker ID I use. Please help take a look. Audio examples are attached: response.zip

Mar 21 '24 11:03 lh7343

When using the webpage-based demo by running streamlit run demo_page.py, the generated audio contains no noise. However, I do notice noise at the beginning of the sample audio. Can you please provide more details about this issue?

Mar 26 '24 03:03 syq163

I am using the API. Here is my docker run command:
docker run --gpus "device=3" -d --name EmotiVoice -p 28021:8000 -v /raid/liuhao/EmotiVoice:/workspace/EmotiVoice -w /workspace/EmotiVoice/EmotiVoice emoti-voice:v1 env LANG=C.UTF-8 sh -c "uvicorn openaiapi:app --reload --host 0.0.0.0 --port 8000 >> log/all.log 2>&1"
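To make the report easier to reproduce, the API call itself can be scripted. The following is a minimal sketch, not taken from the thread: it assumes the server exposes an OpenAI-style `/v1/audio/speech` endpoint that accepts `input`, `voice`, and `response_format` as JSON fields, and that port 28021 is reachable on localhost as in the `-p 28021:8000` mapping above. `build_speech_request` and `synthesize_to_file` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: host port 28021 comes from the `-p 28021:8000` mapping above; the path
# and JSON field names follow the OpenAI speech API shape and may differ in your checkout.
API_URL = "http://localhost:28021/v1/audio/speech"


def build_speech_request(text, voice, url=API_URL):
    """Build (but do not send) a POST request for one synthesis call."""
    payload = {"input": text, "voice": voice, "response_format": "mp3"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text, voice, out_path):
    """Send the request and save the returned audio bytes to out_path."""
    with urllib.request.urlopen(build_speech_request(text, voice), timeout=60) as resp:
        audio_bytes = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio_bytes)
    return len(audio_bytes)
```

Calling `synthesize_to_file` with the exact text and speaker ID that produce the pop, and attaching the saved file, gives the maintainers the details they asked for.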

Apr 02 '24 08:04 lh7343

import os
from pydub import AudioSegment
import logging

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

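def remove_or_silence_noise_from_audio_files(directory, noise_duration_ms, mode):
    # mode 1 trims the first noise_duration_ms milliseconds; mode 2 replaces them with silence
    if mode not in (1, 2):
        raise ValueError(f"Invalid mode {mode!r}, must be 1 or 2")

    # Determine the output folder for processed audio files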
    output_folder = os.path.join(directory, "Processed_Audio")
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)
        logging.info(f"Folder created: {output_folder}")

    # Get all audio files
    audio_files = [file for file in os.listdir(directory) if file.endswith(('.mp3', '.wav'))]
    logging.info(f"Found {len(audio_files)} audio files.")

    # Initialize statistics variables
    success_count = 0
    fail_count = 0
    failed_files = []

    # Process each file
    for file in audio_files:
        file_path = os.path.join(directory, file)
        try:
            # Load the audio
            audio = AudioSegment.from_file(file_path)
            logging.info(f"Processing audio file: {file_path}")

            if mode == 1:
                # Remove noise from the beginning of the audio for noise_duration_ms milliseconds
                processed_audio = audio[noise_duration_ms:]
            elif mode == 2:
                # Create a silence segment at the clip's own sample rate and
                # replace the beginning noise_duration_ms milliseconds with it
                silence = AudioSegment.silent(duration=noise_duration_ms, frame_rate=audio.frame_rate)
                processed_audio = silence + audio[noise_duration_ms:]

            # Save the new audio file
            new_file_path = os.path.join(output_folder, file)
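            # Derive the export format from the real extension instead of the last three characters
            processed_audio.export(new_file_path, format=os.path.splitext(file)[1][1:].lower())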
            logging.info(f"Processed audio file saved to: {new_file_path}")
            success_count += 1
        except Exception as e:
            logging.error(f"Error processing audio file {file_path}: {e}")
            fail_count += 1
            failed_files.append((file_path, str(e)))

    # Log the results
    logging.info(f"Processing complete. Success: {success_count}, Failures: {fail_count}")
    if fail_count > 0:
        logging.info("Failed files and reasons:")
        for file, error in failed_files:
            logging.info(f"File: {file}, Error: {error}")

if __name__ == "__main__":
    # User inputs the processing time, default is 100ms
    try:
        noise_duration_ms = int(input("Enter the noise processing time (ms, default 100ms): ") or "100")
    except ValueError:
        print("Invalid input, using default value of 100ms")
        noise_duration_ms = 100
    
    # User chooses the processing mode; keep asking until a valid value is entered
    mode = None
    while mode not in (1, 2):
        try:
            mode = int(input("Choose the mode (1: Remove beginning noise, 2: Replace beginning noise with silence): "))
        except ValueError:
            mode = None
        if mode not in (1, 2):
            print("Invalid mode, must be 1 or 2")

    # Call the function to process audio files in the current directory
    remove_or_silence_noise_from_audio_files(os.getcwd(), noise_duration_ms, mode)
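
A gentler variant of the same workaround is a short fade-in, which attenuates an onset click without shortening the clip or inserting hard silence. The sketch below is a standard-library illustration only, assuming mono 16-bit PCM samples; `fade_in_samples` is a hypothetical helper and not part of EmotiVoice or pydub. With pydub, `AudioSegment.fade_in(duration)` applies a comparable ramp directly to a loaded segment.

```python
import array


def fade_in_samples(samples, sample_rate, fade_ms=20):
    """Return a copy of mono 16-bit PCM samples with a linear fade-in applied."""
    faded = array.array("h", samples)
    # Number of samples covered by the fade, capped at the clip length
    fade_len = min(len(faded), sample_rate * fade_ms // 1000)
    for i in range(fade_len):
        # Gain ramps linearly from 0.0 at the first sample up to just under 1.0
        faded[i] = int(faded[i] * i / fade_len)
    return faded
```

The fade length is a judgment call: it needs to cover the click, but a long ramp will soften the first syllable, so it is worth starting with a few tens of milliseconds and listening to the result.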

Apr 20 '24 17:04 zsanjin-p