
player.play() doesn't correctly handle sound files with non-default format

Open iarspider opened this issue 2 years ago • 13 comments

I'm trying to use the new Sounds ext, and I have noticed that playing mono (single-channel) mp3 files produces a weird result: the file seems to be played at 2x speed, or with the pitch shifted up. Converting the file to stereo fixes the issue.

MediaInfo output for mono file:

General
Complete name                            : Minion General Speech@[email protected]
Format                                   : MPEG Audio
File size                                : 25.3 KiB
Duration                                 : 1 s 619 ms
Overall bit rate mode                    : Constant
Overall bit rate                         : 128 kb/s
Genre                                    : Other

Audio
Format                                   : MPEG Audio
Format version                           : Version 1
Format profile                           : Layer 3
Duration                                 : 1 s 620 ms
Bit rate mode                            : Constant
Bit rate                                 : 128 kb/s
Channel(s)                               : 1 channel
Sampling rate                            : 44.1 kHz
Frame rate                               : 38.281 FPS (1152 SPF)
Compression mode                         : Lossy
Stream size                              : 25.3 KiB (100%)

MediaInfo output for stereo file:

General
Complete name                            : Minion General Speech@[email protected]
Format                                   : MPEG Audio
File size                                : 26.0 KiB
Duration                                 : 1 s 645 ms
Overall bit rate mode                    : Constant
Overall bit rate                         : 128 kb/s
Genre                                    : Other
Writing library                          : LAME3.100

Audio
Format                                   : MPEG Audio
Format version                           : Version 1
Format profile                           : Layer 3
Format settings                          : Joint stereo / MS Stereo
Duration                                 : 1 s 646 ms
Bit rate mode                            : Constant
Bit rate                                 : 128 kb/s
Channel(s)                               : 2 channels
Sampling rate                            : 44.1 kHz
Frame rate                               : 38.281 FPS (1152 SPF)
Compression mode                         : Lossy
Stream size                              : 25.7 KiB (99%)
Writing library                          : LAME3.100

iarspider avatar Nov 12 '22 18:11 iarspider

Hello! Thanks for the issue. If this is a general help question, for a faster response consider joining the official Discord Server

Else if you have an issue with the library please wait for someone to help you here.

github-actions[bot] avatar Nov 12 '22 18:11 github-actions[bot]

I don't have any 5.1 or 7.1 files, but I would guess they will also be handled incorrectly. Maybe the pitch will shift down?

iarspider avatar Nov 12 '22 18:11 iarspider

So, a small update: player.play() expects the sound file to be in one specific format: 44.1 kHz, 128 kb/s, 2 channels. If any of the parameters are off, the sound will not be played correctly. I think (but don't have time to test it) that adding

-ab 128 -ar 44100 -ac 2

before pipe:1 in the ffmpeg invocation here and maybe here will help.
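A rough sketch of why a format mismatch would change the playback speed, assuming the player reads ffmpeg's raw s16le output and always treats it as 44.1 kHz stereo (that is my reading of the behaviour described above, not something checked against the library source). The helper names `pcm_duration` and `speed_factor` are made up for this illustration.

```python
BYTES_PER_SAMPLE = 2  # s16le = signed 16-bit little-endian, 2 bytes per sample


def pcm_duration(num_bytes: int, sample_rate: int, channels: int) -> float:
    """Seconds of audio that num_bytes of raw s16le PCM represent."""
    return num_bytes / (sample_rate * channels * BYTES_PER_SAMPLE)


def speed_factor(src_rate: int, src_channels: int,
                 out_rate: int, out_channels: int) -> float:
    """Playback speed relative to real time when PCM produced as
    (src_rate, src_channels) is consumed as (out_rate, out_channels)."""
    return (out_rate * out_channels) / (src_rate * src_channels)


# One second of mono 44.1 kHz audio as raw s16le bytes.
one_second_mono = 44100 * 1 * BYTES_PER_SAMPLE

if __name__ == "__main__":
    # Interpreted correctly (mono) vs. as stereo by a fixed-format player.
    print(pcm_duration(one_second_mono, 44100, 1))
    print(pcm_duration(one_second_mono, 44100, 2))
    # Mono source into a stereo player, and a 6-channel (5.1) source into the same player.
    print(speed_factor(44100, 1, 44100, 2))
    print(speed_factor(44100, 6, 44100, 2))
```

Under that assumption a mono file is consumed twice as fast as intended, which matches the 2x speed report, and a 5.1 file would come out at a third of the speed. Forcing `-ar` and `-ac` on the ffmpeg side makes the produced bytes match whatever the player expects.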

iarspider avatar Nov 20 '22 10:11 iarspider

can confirm - my mono .mp3s also play super fast

until I changed the lines as suggested above:

            self.proc = subprocess.Popen(
                [
                    ffmpeg_bin,
                    "-i",
                    source,
                    "-loglevel",
                    "panic",
                    "-vn",
                    "-f",
                    "s16le",
                    "-ab",
                    "128",
                    "-ar",
                    "44100",
                    "-ac",
                    "2",
                    "pipe:1",
                ],

plomdawg avatar Mar 14 '23 22:03 plomdawg

This is where the sample rate is hard coded. It looks like this just isn't finished, and local files haven't been fully covered yet. I don't know if there's a better way (maybe reading the sample rate from the audio metadata), but I made a PR that at least lets you set those values, so we don't have to change the library code anymore lol

sockheadrps avatar Apr 23 '23 19:04 sockheadrps

Right now the audio files actually have to be converted to 48000 Hz, not 44100. I don't think this is mentioned anywhere in the docs, and the solution is not trivial to find.

Is there a problem to merge the PR by @sockheadrps?

enekochan avatar Jul 25 '23 22:07 enekochan

Is there a problem to merge the PR by @sockheadrps?

I forgot it existed :) There's a couple meta issues with the pr, but once they've been fixed it can be merged

IAmTomahawkx avatar Jul 25 '23 22:07 IAmTomahawkx

I think this issue can be closed, since #454 addressed the problem by detecting the sample rate and channels from the audio metadata and exposing both properties with setters, just in case the metadata isn't accurate.
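For anyone curious what reading those two values from metadata can look like, here is a minimal sketch that uses ffprobe's JSON output. This is not necessarily how #454 does it; the function names `probe_command`, `parse_probe` and `probe_audio` are hypothetical, and it assumes ffprobe is on PATH.

```python
import json
import subprocess


def probe_command(path: str) -> list:
    """Build an ffprobe call that reports the first audio stream as JSON."""
    return [
        "ffprobe",
        "-v", "error",
        "-select_streams", "a:0",
        "-show_entries", "stream=sample_rate,channels",
        "-of", "json",
        path,
    ]


def parse_probe(output: str) -> tuple:
    """Extract (sample_rate, channels) from ffprobe's JSON output."""
    stream = json.loads(output)["streams"][0]
    # ffprobe reports sample_rate as a string and channels as a number.
    return int(stream["sample_rate"]), int(stream["channels"])


def probe_audio(path: str) -> tuple:
    """Run ffprobe on a file and return (sample_rate, channels)."""
    result = subprocess.run(
        probe_command(path), capture_output=True, text=True, check=True
    )
    return parse_probe(result.stdout)
```

The returned pair could then be passed as `-ar` and `-ac`, or used to open the output stream with matching parameters. Keeping manual setters around still makes sense for files whose metadata is wrong.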

sockheadrps avatar Jun 30 '24 21:06 sockheadrps

Thanks for looking into it. With this update however, I'm getting weird results - if I play two sounds in a row, the second one isn't played, and holds the file open for eternity. I will try to post a reproducer later.

iarspider avatar Jul 24 '24 17:07 iarspider

And here is the reproducer:

import asyncio
import os
import time

import eyed3

from pathlib import Path

from twitchio.ext import commands, sounds


class Bot(commands.Bot):
    def play_sound(self, sound: str):
        soundfile = str(Path(__file__).parent / sound)
        sound = sounds.Sound(soundfile)

        print("play sound", soundfile)

        self.player.play(sound)

        duration = eyed3.load(soundfile).info.time_secs
        time.sleep(duration)
        print(f"slept for {duration}s")

    def __init__(self, initial_channels=None):
        super().__init__(
            token=os.getenv("TWITCH_CHAT_PASSWORD"),
            client_id=os.getenv("TWITCH_CHAT_CLIENT_ID"),
            nick="arachnobot",
            prefix="!",
            initial_channels=["#iarspider"],
        )

        self.player = sounds.AudioPlayer(callback=self.player_done)

    async def event_ready(self):
        print(f"Ready | {self.nick}")
        self.play_sound("ding-sound-effect_1.mp3")
        self.play_sound("ding-sound-effect_1.mp3")


    async def player_done(self):
        print("Player done")
        pass


async def main():
    global client, twitch_bot
    twitch_bot = Bot()
    await twitch_bot.start()


if __name__ == "__main__":
    asyncio.run(main())

(The only extra dependency is eyed3, used to get the sound file duration; it can be replaced with a fixed duration.) File used: ding-sound-effect_1.mp3

This outputs:

Ready | arachnobot
play sound e:\Temp\4\ding-sound-effect_1.mp3
slept for 2.95s
play sound e:\Temp\4\ding-sound-effect_1.mp3
slept for 2.95s
Player done

and plays ding only once.

iarspider avatar Jul 26 '24 10:07 iarspider

This worked fine before. The Bot.play_sound function in my "production" code is also used to play temporary Text-to-Speech mp3 files, and the waiting is mostly there so those temporary files can be removed once they are done playing.

iarspider avatar Jul 26 '24 10:07 iarspider

This one works properly:

import asyncio
import os

from pathlib import Path

# from dotenv import load_dotenv
from twitchio.ext import commands, sounds


class Bot(commands.Bot):
    async def play_sound(self, sound: str):
        soundfile = str(Path(__file__).parent / sound)
        sound = sounds.Sound(soundfile)

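        print("wait for lock")
        # Blocks (without freezing the event loop) until the previous
        # sound's callback has released the lock.
        await self.lock.acquire()
        print("play sound", soundfile)
        self.player.play(sound)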

    def __init__(self, initial_channels=None):
        super().__init__(
            token=os.getenv("TWITCH_CHAT_PASSWORD"),
            client_id=os.getenv("TWITCH_CHAT_CLIENT_ID"),
            nick="arachnobot",
            prefix="!",
            initial_channels=["#iarspider"],
        )
        
        self.lock = asyncio.Lock()
        self.player = sounds.AudioPlayer(callback=self.player_done)

    async def event_ready(self):
        print(f"Ready | {self.nick}")
        await self.play_sound("ding-sound-effect_1.mp3")
        await self.play_sound("ding-sound-effect_1.mp3")


    async def player_done(self):
        print("player done")
        # Playback finished: let the next waiting play_sound() proceed.
        self.lock.release()


async def main():
    global client, twitch_bot
    twitch_bot = Bot()
    await twitch_bot.start()


if __name__ == "__main__":
    # load_dotenv()
    asyncio.run(main())
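The same lock pattern, isolated from TwitchIO so it can be run without a bot or audio device. `FakePlayer` is a made-up stand-in that "plays" in a worker thread and then schedules an async callback on the event loop. I am assuming the real player behaves roughly like this and that the `time.sleep` version fails because it keeps the loop from running that callback; neither is confirmed in this thread.

```python
import asyncio
import threading
import time


class FakePlayer:
    """Hypothetical stand-in for an audio player (not the TwitchIO class)."""

    def __init__(self, loop, callback):
        self._loop = loop
        self._callback = callback
        self.played = []

    def play(self, name, duration=0.05):
        def run():
            time.sleep(duration)  # pretend to stream audio in the worker thread
            self.played.append(name)
            # Hand the "done" notification back to the event loop thread.
            asyncio.run_coroutine_threadsafe(self._callback(), self._loop)

        threading.Thread(target=run, daemon=True).start()


async def demo():
    loop = asyncio.get_running_loop()
    lock = asyncio.Lock()

    async def on_done():
        lock.release()  # previous sound finished, unblock the next one

    player = FakePlayer(loop, on_done)

    async def play_sound(name):
        await lock.acquire()  # yields to the loop instead of blocking it
        player.play(name)

    await play_sound("first")
    await play_sound("second")

    # Wait for the last sound to finish before returning.
    await lock.acquire()
    lock.release()
    return player.played


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because `await lock.acquire()` suspends only the current coroutine, the loop stays free to run the completion callback. Cleanup such as deleting a temporary TTS file could go in the callback, just before the lock is released.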

iarspider avatar Jul 26 '24 10:07 iarspider