discord.py Support for receiving audio from voice channels

This PR is based off #6507. I made a new branch and PR for reasons outlined in this comment on my old PR.

Summary

This pull request provides an implementation of receiving and processing audio packets from discord.

This implementation of receiving audio mimics the design of how the library plays audio. From the user's perspective, VoiceClient.listen is called to begin the process. The function takes an AudioSink object as well as three optional arguments: decode, supress_warning, and after. supress_warning is self-explanatory, and I'll touch on the other two later. Just as VoiceClient.play takes an AudioSource object and "plays" it, VoiceClient.listen takes an AudioSink object and "listens." The sink object and the other two optional arguments are passed to an AudioReceiver object, just as an audio source object is passed to an AudioPlayer object. AudioReceiver is a thread that is in charge of receiving audio.

The AudioReceiver class is similar to the AudioPlayer class, but it does not do any of the audio work itself. Instead, it creates a subprocess to handle the audio processing, which isolates that work from the main process. The reason for isolating it is to prevent the main process from being overworked and ultimately failing. AudioReceiver first receives raw data from VoiceClient.recv_audio, and then uses a pipe to send the audio data to the subprocess. VoiceClient.recv_audio returns audio if there is any to receive, and otherwise returns nothing. After sending audio, AudioReceiver waits to receive the processed data and then passes it to the sink object within the thread. Once listening stops for whatever reason, cleanup is done and the after function is called.
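As a rough illustration of the "send raw data over a pipe, wait for the processed result" pattern described above, here is a minimal sketch. It is not the PR's code: it uses the standard subprocess module with stdin/stdout pipes as a stand-in, the length-prefixed framing is an assumption, and reversing the bytes stands in for the real audio processing. CHILD and process_in_child are hypothetical names.

```python
import struct
import subprocess
import sys

# Source of the child process: read length-prefixed packets from stdin,
# "process" them (here: reverse the bytes) and write them back to stdout.
CHILD = r'''
import struct, sys
inp, out = sys.stdin.buffer, sys.stdout.buffer
while True:
    header = inp.read(4)
    if len(header) < 4:
        break  # parent closed the pipe
    (size,) = struct.unpack(">I", header)
    data = inp.read(size)
    processed = data[::-1]  # stand-in for decrypting/decoding
    out.write(struct.pack(">I", len(processed)) + processed)
    out.flush()
'''


def process_in_child(packets):
    """Send each packet to the child and wait for the processed result."""
    results = []
    with subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    ) as proc:
        for packet in packets:
            proc.stdin.write(struct.pack(">I", len(packet)) + packet)
            proc.stdin.flush()
            (size,) = struct.unpack(">I", proc.stdout.read(4))
            results.append(proc.stdout.read(size))
    # Leaving the with-block closes the pipes, which lets the child exit.
    return results


processed_packets = process_in_child([b"abc", b"hello"])
```

The point of the sketch is only the division of labour: the parent never does the heavy work, it just forwards bytes and collects results.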

The class that takes care of audio processing is AudioUnpacker. When AudioUnpacker is instantiated, it's given the info it needs to do audio processing (mode, secret_key, and decode). This class runs in a subprocess and waits to receive raw audio from the main process. Once it receives raw audio, it passes it to AudioUnpacker.unpack_audio_packet, which first decrypts the packet. Once it's decrypted, it can be determined whether the packet is an audio packet or an RTCP packet. If it's an RTCP packet, it's returned immediately. Otherwise, it checks (1) whether it's a silent frame and (2) whether decode is true. If it's a silent frame, it returns nothing. If decode is true, it uses OpusDecoder to decode the packet. Now that the packet is fully processed, it is sent back to the main process.
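The branching described above can be sketched roughly as follows. This is a simplified illustration, not the code in the PR: decryption is assumed to have already happened, the RTP header is assumed to be the fixed 12 bytes with no extensions, the RTCP check uses the packet types 200-204 from RFC 3550, the silence frame bytes are the ones Discord documents for Opus silence, and the decoder is passed in as a plain callable. The name unpack is hypothetical.

```python
from typing import Callable, Optional

OPUS_SILENCE = b"\xf8\xff\xfe"       # Opus silence frame used by Discord
RTCP_PACKET_TYPES = range(200, 205)  # SR, RR, SDES, BYE, APP (RFC 3550)
RTP_HEADER_SIZE = 12                 # assumes no CSRC list or header extension


def unpack(
    data: bytes,
    decode: bool,
    decoder: Callable[[bytes], bytes],
) -> Optional[bytes]:
    """Classify an already-decrypted packet and process it accordingly."""
    if data[1] in RTCP_PACKET_TYPES:
        # RTCP packets are returned immediately, untouched.
        return data
    payload = data[RTP_HEADER_SIZE:]
    if payload == OPUS_SILENCE:
        # Silent frames produce nothing.
        return None
    # Audio packets are decoded only when requested.
    return decoder(payload) if decode else payload


# Example packets: an RTCP receiver report header and two RTP audio packets.
rtcp = bytes([0x80, 201]) + b"\x00" * 6
silent = bytes([0x80, 0x78]) + b"\x00" * 10 + OPUS_SILENCE
audio = bytes([0x80, 0x78]) + b"\x00" * 10 + b"abc"
```

Passing the decoder as a callable keeps the sketch runnable without libopus; in the PR that role is played by OpusDecoder.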

Lastly, DiscordVoiceWebSocket uses the SPEAKING event to keep track of which ssrc correlates to which user. It provides functionality for AudioReceiver to fill in the AudioFrame.user attribute.
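A minimal sketch of the ssrc bookkeeping, assuming the SPEAKING payload carries "ssrc" and "user_id" fields as in Discord's voice gateway documentation. SsrcRegistry and its method names are hypothetical and only illustrate the idea; they are not the names used in the PR.

```python
from typing import Dict, Optional


class SsrcRegistry:
    """Remember which user id each ssrc belongs to."""

    def __init__(self) -> None:
        self._users: Dict[int, int] = {}

    def on_speaking(self, data: dict) -> None:
        # Called for every SPEAKING event received on the voice websocket.
        self._users[data["ssrc"]] = int(data["user_id"])

    def user_for(self, ssrc: int) -> Optional[int]:
        # Returns None when no SPEAKING event has been seen for this ssrc yet.
        return self._users.get(ssrc)


registry = SsrcRegistry()
registry.on_speaking({"ssrc": 201351, "user_id": "1234567890", "speaking": 1})
```

A lookup that can return None matches the behaviour described above, where a frame may arrive before the corresponding user is known.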

Possible problems

The biggest problem with this is that discord does not officially support receiving audio. It works; however, it's not guaranteed to be bug-free and production-safe. An example of this is a part of the code that I had to comment out because discord sends what I'm pretty sure are invalid RTP packets. As such, there's no telling when discord could push breaking changes to audio receive features without any warning.

In addition, the only RTCP packet that I was actually able to test was the RTCPReceiverReportPacket since I don't think discord sends any other RTCP packets. I'm also fairly certain that the RTCP Receiver packet discord sends is invalid as well, in that it indicated there was 1 Receiver Report Block, while the sent data did not resemble any report blocks. Maybe I messed something up, but I examined the data quite closely and referred to reliable sources on the structure of the RTCP Receiver Packet.

If this pull request is held off on due to reasons related to discord not supporting audio receive, then I'll continue to keep this code up to date until discord does actually support audio receive (if ever).

Testing the feature

I've created a new file in the examples directory named "basic_voice_listening.py"

If you wanna contact me then just dm me on discord: Sheppsu#5460

Checklist

[x] If code changes were made then they have been tested.
[x] I have updated the documentation to reflect the changes.
[ ] This PR fixes an issue.
[x] This PR adds something new (e.g. new method or parameters).
[ ] This PR is a breaking change (e.g. methods or parameters removed/renamed)
[ ] This PR is not a code change (e.g. documentation, README, ...)

Mar 05 '23 07:03 Sheppsu

First, thank you very much for this integration. Voice recording is highly requested by discord.py users (including me), and I really hope that this commit will be merged. There seem to be some issues in the code. I wrote this basic implementation:

import discord
from discord import app_commands


DEV_GUILD = discord.Object(id=913766363791757353)

class Client(discord.Client):
    def __init__(self, *, intents: discord.Intents):
        super().__init__(intents=intents)
        self.tree = app_commands.CommandTree(self)

    async def setup_hook(self):
        self.tree.copy_global_to(guild=DEV_GUILD)
        await self.tree.sync(guild=DEV_GUILD)

intents = discord.Intents.default()
bot = Client(intents=intents)


@bot.event
async def on_ready():
    print(f'Logged in as {bot.user} (ID: {bot.user.id})')
    print('------')


@bot.tree.command()
async def start(interaction: discord.Interaction):
    """Start listening."""

    vc = interaction.user.voice
        
    if not vc:
        return await interaction.response.send_message('You\'re not in a vc right now')

    voice_client = discord.utils.get(bot.voice_clients, guild=interaction.guild)
    if voice_client is None:
        # Not connected in this guild yet, so connect to the user's channel.
        voice_client = await vc.channel.connect()
    elif voice_client.channel.id != vc.channel.id:
        await voice_client.move_to(vc.channel)

    voice_client.listen(discord.MP3AudioFileSink(output_dir='/tmp'), after=on_listening_stopped)
    await interaction.response.send_message(f'Started listening')


@bot.tree.command()
async def stop(interaction: discord.Interaction):
    """Stop listening."""

    voice_client = discord.utils.get(bot.voice_clients, guild=interaction.guild)
    if not voice_client:
        return await interaction.response.send_message(f'I am not connected to a voice channel.')

    if not voice_client.is_listening():
        return await interaction.response.send_message("Not currently listening")
    
    voice_client.stop_listening()
    await voice_client.disconnect()
 
    await interaction.response.send_message(f'No longer listening.')


@bot.tree.command()
async def pause_listening(interaction: discord.Interaction):
    """Pause listening."""

    voice_client = discord.utils.get(bot.voice_clients, guild=interaction.guild)
    if not voice_client:
        return await interaction.response.send_message(f'I am not connected to a voice channel.')

    if not voice_client.is_listening():
        return await interaction.response.send_message("Not currently listening")
    
    if voice_client.is_listening_paused():
        return await interaction.response.send_message("Listening already paused")
    
    voice_client.pause_listening()
    await interaction.response.send_message("Listening has been paused")


@bot.tree.command()
async def resume_listening(interaction: discord.Interaction):
    """Resume listening."""

    voice_client = discord.utils.get(bot.voice_clients, guild=interaction.guild)

    if not voice_client:
        return await interaction.response.send_message(f'I am not connected to a voice channel.')

    if not voice_client.is_listening():
        return await interaction.response.send_message("Not currently listening")
    
    if not voice_client.is_listening_paused():
        return await interaction.response.send_message("Already resumed")
    
    voice_client.resume_listening()
    await interaction.response.send_message("Listening has been resumed")


def on_listening_stopped(sink, exc=None):
    sink.convert_files()


bot.run("TOKEN")

When you start recording with /start and then stop the recording with /stop, you get an error:

Traceback (most recent call last):
  File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/sink.py", line 741, in run
    self._do_run()
  File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/sink.py", line 731, in _do_run
    packet = self.client.recv_audio_packet(dump=not self._resumed.is_set())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/voice_client.py", line 842, in recv_audio_packet
    raise err[0]
TypeError: exceptions must derive from BaseException

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/sink.py", line 754, in _call_after
    self.after(self.sink, error)
  File "/home/jourdelune/Bureau/Interaction/Bot/test.py", line 100, in on_listening_stopped
    sink.convert_files()
  File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/sink.py", line 613, in convert_files
    self.output_files[ssrc] = self.convert_file(f, new_name)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/sink.py", line 691, in convert_file
    process = subprocess.Popen(args, creationflags=subprocess.CREATE_NO_WINDOW)
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'subprocess' has no attribute 'CREATE_NO_WINDOW'

You can remove the creationflags=subprocess.CREATE_NO_WINDOW argument to work around this. I'm sharing this basic implementation for anyone who wants to test the code ^^. It would also be useful if the on_listening_stopped callback in voice_client.listen(discord.WaveAudioFileSink(output_dir='/tmp'), after=on_listening_stopped) were awaited and returned the file, because you might want to send the audio to a channel.

Mar 05 '23 16:03 Jourdelune

The issue here is that this argument only exists and works on Windows platforms. The PR should include a system check before executing.
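A minimal sketch of such a check, assuming the flag should simply be omitted on non-Windows platforms. popen_platform_kwargs is a hypothetical helper name and not part of the PR.

```python
import subprocess
import sys


def popen_platform_kwargs() -> dict:
    """Return extra Popen keyword arguments for the current platform."""
    if sys.platform == "win32":
        # CREATE_NO_WINDOW is only defined by the subprocess module on Windows.
        return {"creationflags": subprocess.CREATE_NO_WINDOW}
    return {}


# Example: start a trivial child process; the Windows-only name is never
# touched on other platforms.
completed = subprocess.run([sys.executable, "-c", "pass"], **popen_platform_kwargs())
```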

~~The python docs do not mention this strangely.~~ Woops, I stand corrected!

Mar 05 '23 18:03 AbstractUmbra

The python docs do not mention this strangely.

They actually do. CREATE_NO_WINDOW is listed specifically under the Windows constants section, and where it's mentioned prior to that, it's in reference to the also Windows-only STARTUPINFO.

Mar 05 '23 18:03 mikeshardmind

I'll work on redoing the current example file with the critique I've gotten from everyone.

Mar 06 '23 02:03 Sheppsu

Is it preferable that the example be structured similarly to the basic_voice.py example or instead use a slash command implementation? I could also simply scrap the example altogether if it's not really needed.

Mar 06 '23 06:03 Sheppsu

The basic voice example was written over a year ago, so maybe you could implement slash commands. I have also found another issue:

ERROR:discord.sink:Calling the after function failed.
Traceback (most recent call last):
  File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/sink.py", line 756, in _call_after
    self.after(self.sink, error)
  File "/home/jourdelune/Bureau/Interaction/Bot/commands/transcribe.py", line 41, in on_listening_stopped
    sink.convert_files()
  File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/sink.py", line 613, in convert_files
    new_name = f"audio-{user.name}#{user.discriminator}-{ssrc}" if user is not None else None
                        ^^^^^^^^^
AttributeError: 'int' object has no attribute 'name'

user is the id of the user, not a discord.py user object. Here is the code:

from discord.ext import commands
from discord import app_commands
from typing import Literal

import discord
import asyncio


class Transcribe(commands.Cog):
    def __init__(self, bot):
        self.bot = bot


    @app_commands.command(name="transcribe", description="transcribe.describe")
    @app_commands.checks.cooldown(1, 5.0, key=lambda i: (i.user.id))
    @app_commands.describe(style='transcribe.describe.style')
    async def transcribe(self, interaction: discord.Interaction, style: Literal['message', 'embed', 'webhook']):
        """Transcribe voice channel."""
        voice_channel = interaction.user.voice
        
        if voice_channel is None:
            return await interaction.response.send_message(await interaction.translate('transcribe.not_in_voice'))

        voice_channel = interaction.user.voice.channel
        voice_client = discord.utils.get(self.bot.voice_clients, guild=interaction.guild)

        if voice_client is None:
            voice_client = await voice_channel.connect()
        else:
            if voice_channel != voice_client.channel:
                return await interaction.response.send_message(await interaction.translate('transcribe.already'))

        voice_client.play(discord.FFmpegPCMAudio('./sounds/start.mp3'))

        await interaction.response.send_message(await interaction.translate('transcribe.description'))
        voice_client.listen(discord.WaveAudioFileSink(output_dir='/tmp/'), after=self.on_listening_stopped)
        await asyncio.sleep(3)
        voice_client.stop_listening()
        
    def on_listening_stopped(self, sink, exc=None, *, args = None): 
        sink.convert_files() 
        print(sink, exc, args)
    

async def setup(bot):
    await bot.add_cog(Transcribe(bot))

Also, I don't know if the library already does it, but it would be useful to add the guild id to the file name, since a bot can be in more than one voice channel at the same time (in different guilds).

Mar 06 '23 16:03 Jourdelune

Looking at the code, I don't understand how this can be implemented. The callback passed as after to the listen function only receives 2 arguments, the sink and a potential exception, and I don't think that's enough. For example, if a Discord bot is recording voice channels on two different servers, how do you know which server a recording came from when it ends?

Without this information it is difficult to make good use of the recording. I think it would be good to support passing extra arguments to the callback; that way, by adding the server id to the callback arguments, it would be possible to know where the recording came from and, for example, send the file to the right channel. It might also be useful to specify which user each audio recording is from, although this can be done via the filename in sink.output_files.

Mar 06 '23 22:03 Jourdelune

You make good points on user experience with the functionality. I'll make it so that args and kwargs can be passed, and also create a new class specifically for the audio file that can carry information on the user and such.

Mar 06 '23 22:03 Sheppsu

The callback doesn't need to receive that information that way- you could for example, use a sink factory that took that information in from the scope.

Mar 06 '23 22:03 Vexs

The callback doesn't need to receive that information that way- you could for example, use a sink factory that took that information in from the scope.

Could you expand on the idea of a sink factory?

Mar 06 '23 22:03 Sheppsu

Sorry, mixed up some terms and the actual problem itself on my part. Don't need a sink factory- what he's looking for would just be a closure.

Consider the following:

def listen_stopped_closure(self, guild, event):
    # guild and event are captured from this scope by the inner callback
    def on_listening_stopped(sink, exc=None, *, args=None):
        sink.convert_files()
        event.set()
        # whatever else you need to do with guild, etc.
    return on_listening_stopped

....
    event = asyncio.Event()
    voice_client.listen(discord.WaveAudioFileSink(output_dir='/tmp/'), after=self.listen_stopped_closure(voice_channel.guild, event))
    await event.wait()
    voice_client.stop_listening()
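A self-contained sketch of the same closure idea, with a plain thread standing in for the receiver thread and a dict standing in for the sink (both are assumptions made so the snippet runs without discord.py). Because the callback runs outside the event loop's thread, this sketch sets the asyncio.Event through loop.call_soon_threadsafe. make_after is a hypothetical name.

```python
import asyncio
import threading


def make_after(guild_id, loop, done):
    """Return a callback that remembers guild_id, loop and done."""

    def on_listening_stopped(sink, exc=None):
        # Runs in the worker thread. The captured guild_id tells us where
        # the recording came from without changing the callback signature.
        sink["guild_id"] = guild_id
        # asyncio.Event is not thread-safe, so hand set() to the loop.
        loop.call_soon_threadsafe(done.set)

    return on_listening_stopped


async def main():
    loop = asyncio.get_running_loop()
    done = asyncio.Event()
    sink = {}  # stand-in for a real sink object
    after = make_after(1234, loop, done)
    # Stand-in for the receiver thread calling `after` once listening stops.
    threading.Thread(target=after, args=(sink,)).start()
    await asyncio.wait_for(done.wait(), timeout=5)
    return sink


result = asyncio.run(main())
```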

Mar 06 '23 22:03 Vexs

I have seen another issue:

from discord.ext import commands
from discord import app_commands
from typing import Literal

import discord
import asyncio

class Transcribe(commands.Cog):
    def __init__(self, bot):
        self.bot = bot
        self.listen_guild = {}


    @app_commands.command(name="transcribe", description="transcribe.describe")
    @app_commands.checks.cooldown(1, 5.0, key=lambda i: (i.user.id))
    @app_commands.describe(style='transcribe.describe.style')
    async def transcribe(self, interaction: discord.Interaction, style: Literal['message', 'embed', 'webhook']):
        """Transcribe voice channel."""
        voice_channel = interaction.user.voice
        
        if voice_channel is None:
            return await interaction.response.send_message(await interaction.translate('transcribe.not_in_voice'))

        voice_channel = interaction.user.voice.channel
        voice_client = discord.utils.get(self.bot.voice_clients, guild=interaction.guild)

        if voice_client is None:
            voice_client = await voice_channel.connect()
        else:
            if voice_channel != voice_client.channel:
                return await interaction.response.send_message(await interaction.translate('transcribe.already'))

        self.listen_guild[interaction.guild.id] = Listener()
        
        voice_client.play(discord.FFmpegPCMAudio('./sounds/start.mp3'))
      
        await interaction.response.send_message(await interaction.translate('transcribe.description'))

        voice_client.listen(discord.AudioFileSink(discord.WaveAudioFile, '/tmp/'), after=self.on_listening_stopped, guild=interaction.guild)
        await asyncio.sleep(3)
        voice_client.stop_listening()
    

    async def on_listening_stopped(self, sink, exc, guild): 
        sink.convert_files() # block here
        for file in sink.output_files.values(): 
            print(file)
                
    

async def setup(bot):   
    await bot.add_cog(Transcribe(bot))

In on_listening_stopped, the call to sink.convert_files() blocks everything. I have found that the convert function in WaveAudioFile blocks when it tries to read the file:

class WaveAudioFile(AudioFile):
    CHUNK_WRITE_SIZE = 64

    def convert(self, new_name: Optional[str] = None) -> None:
        """Write the raw audio data to a wave file.

        Extends :class:`AudioFile`

        Parameters
        ----------
        new_name: Optional[:class:`str`]
            Name for the wave file excluding ".wav". Defaults to current name if None.
        """

        path = self._get_new_path(self.path, "wav", new_name)
        with wave.open(path, "wb") as wavf:
            wavf.setnchannels(OpusDecoder.CHANNELS)
            wavf.setsampwidth(OpusDecoder.SAMPLE_SIZE // OpusDecoder.CHANNELS)
            wavf.setframerate(OpusDecoder.SAMPLING_RATE)

            print(self.file.read(OpusDecoder.FRAME_SIZE * self.CHUNK_WRITE_SIZE)) # block here

            while frames := self.file.read(OpusDecoder.FRAME_SIZE * self.CHUNK_WRITE_SIZE):
                wavf.writeframes(frames)

        os.remove(self.path)
        file = open(path, "rb")
        file.close()
        self.file = file

I can't suggest a fix for the moment because I don't understand the cause. Maybe the file is already open (MP3 also works fine).

Mar 07 '23 16:03 Jourdelune

from discord import app_commands

import discord
import asyncio


DEV_GUILD = discord.Object(id=913766363791757353)

class Client(discord.Client):
    def __init__(self, *, intents: discord.Intents):
        super().__init__(intents=intents)
        self.tree = app_commands.CommandTree(self)

    async def setup_hook(self):
        self.tree.copy_global_to(guild=DEV_GUILD)
        await self.tree.sync(guild=DEV_GUILD)

intents = discord.Intents.default()
bot = Client(intents=intents)


@bot.event
async def on_ready():
    print(f'Logged in as {bot.user} (ID: {bot.user.id})')
    print('------')


@bot.tree.command()
async def start(interaction: discord.Interaction):
    """Start listening."""

    vc = interaction.user.voice
        
    if not vc:
        return await interaction.response.send_message('You\'re not in a vc right now')

    voice_client = discord.utils.get(bot.voice_clients, guild=interaction.guild)
    if voice_client and voice_client.channel.id != vc.channel.id:
        await voice_client.move_to(vc.channel)
    else:
        voice_client = await vc.channel.connect()

    voice_client.listen(discord.AudioFileSink(discord.MP3AudioFile, '/tmp/'), after=on_listening_stopped)
    await interaction.response.send_message(f'Started listening')

    await asyncio.sleep(3)

    voice_client.stop_listening()
 
    await interaction.channel.send(f'No longer listening.')


async def on_listening_stopped(sink, exc=None):
    sink.convert_files()
    for file in sink.output_files.values():
        print('try')
        file.file.read() # block here
        print('ok')

bot.run("Token")

Here is an example of the issue with MP3

Mar 07 '23 18:03 Jourdelune

Those problems were caused by a bad oversight from me where read was called on a closed file.
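For reference, reading from a closed file object in Python raises ValueError rather than blocking, which can be checked with a small standalone snippet. The temporary file here is only a stand-in for the converted audio file.

```python
import os
import tempfile

# Create a small file to stand in for the converted audio.
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
    tmp.write(b"\x00" * 16)
    path = tmp.name

# Open and immediately close it, keeping only the closed file object.
file = open(path, "rb")
file.close()

try:
    file.read()
    outcome = "read succeeded"
except ValueError:
    # Reading a closed file raises ValueError instead of hanging.
    outcome = "ValueError"
finally:
    os.remove(path)
```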

Mar 07 '23 18:03 Sheppsu

It is not totally fixed, indeed file.file.read() also blocks^^.

async def on_listening_stopped(sink, exc=None):
    sink.convert_files()   
    for file in sink.output_files.values():
        print(file.user.id)
        print(file.file.read()) # block

Also, if the code is running in a cog, file.user will be an int, but outside a cog it will be a user object. Shouldn't this be constant? Otherwise it is confusing.

Mar 07 '23 20:03 Jourdelune

It is not totally fixed, indeed file.file.read() also blocks^^.

I find this strange because the file attribute in this case should be None after convert finishes running.

Also, if the code is running in a cog, file.user will be an int, but outside a cog it will be a user object. Shouldn't this be constant? Otherwise it is confusing.

I'm not sure I understand what you mean. The value of file.user depends on the "best" value that can be returned in the audio frames. VoiceClient asks the gateway for a user object associated with the ssrc and gets back either nothing, a user id, or a Member object. It passes that value in the AudioFrame object to the Sink. AudioFile caches that value depending on its currently cached value (e.g. if the current value is a user id and it's given a Member object, it will replace the user id). There's not a way for the value to be constant, though it will almost always return either None or Member.
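Roughly, the caching rule described above could be sketched like this. Member is a hypothetical stand-in for discord.Member so the snippet runs on its own, and best_user is an illustrative name rather than the actual method in the PR.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Member:
    """Hypothetical stand-in for discord.Member."""
    id: int
    name: str


UserValue = Optional[Union[int, Member]]


def _rank(value: UserValue) -> int:
    # None < user id < Member: higher means more information.
    if value is None:
        return 0
    if isinstance(value, int):
        return 1
    return 2


def best_user(cached: UserValue, incoming: UserValue) -> UserValue:
    """Keep whichever value carries more information about the user."""
    return incoming if _rank(incoming) > _rank(cached) else cached


member = Member(id=42, name="example")
```

With this rule a cached value is only ever replaced by a richer one, which is why the result usually ends up as either None or a Member.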

Mar 07 '23 21:03 Sheppsu

Also, if the code is running in a cog, file.user will be an int, but outside a cog it will be a user object. Shouldn't this be constant? Otherwise it is confusing.

ah right I thought user was something you could control, sorry for that.

I find this strange because the file attribute in this case should be None after convert finishes running.

Indeed. You can test it by replacing the on_listening_stopped function in this code with one that calls file.file.read, if you wish, but yes, we must use file.path.

Mar 07 '23 21:03 Jourdelune

I believe it's not blocking the code, but actually an exception is occurring and it's not being shown in the console due to the error handling, which calls _log.exception (is this supposed to show in the console by default?). Either way, the user should use file.path to open the file themselves or do whatever. Here's the code that fires the callback (mimicking the function in AudioPlayer)

def _call_after(self) -> None:
    error = self._current_error
    if self.after is not None:
        try:
            kwargs = self.after_kwargs if self.after_kwargs is not None else {}
            asyncio.run_coroutine_threadsafe(self.after(self.sink, error, **kwargs), self.client.client.loop)
        except Exception as exc:
            exc.__context__ = error
            _log.exception('Calling the after function failed.', exc_info=exc)
    elif error:
        _log.exception('Exception in voice thread %s', self.name, exc_info=error)

Mar 07 '23 21:03 Sheppsu

I agree.

Mar 07 '23 21:03 Jourdelune

I'm not sure I understand what you mean. The value of file.user depends on the "best" value that can be returned in the audio frames. VoiceClient asks the gateway for a user object associated with the ssrc and gets back either nothing, a user id, or a Member object. It passes that value in the AudioFrame object to the Sink. AudioFile caches that value depending on its currently cached value (e.g. if the current value is a user id and it's given a Member object, it will replace the user id). There's not a way for the value to be constant, though it will almost always return either None or Member.

Wouldn't it be better to use discord.Object?

Mar 07 '23 22:03 dolfies

Wouldn't it be better to use discord.Object?

I assume you mean replacing int with discord.Object? That works well for the use of discord.Object so I'll consider implementing that.

Mar 07 '23 22:03 Sheppsu

It would also be useful to raise errors that occur in the after callback. Otherwise all errors are glossed over, which does not make development easier.

async def on_listening_stopped(sink, exc=None):
    sink.convert_files() 
    print(5/0) # no error raised
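One possible way to surface such errors, sketched under the assumption that the after callback is a coroutine scheduled from a worker thread with asyncio.run_coroutine_threadsafe: wait on the returned future and log whatever it raised. after_callback and call_after are hypothetical names for illustration, and the errors list exists only so the result can be inspected.

```python
import asyncio
import logging

_log = logging.getLogger(__name__)
errors = []


async def after_callback() -> None:
    # Stand-in for a user callback that fails.
    print(5 / 0)


def call_after(loop: asyncio.AbstractEventLoop) -> None:
    # Runs in a worker thread, like the receiver thread would.
    future = asyncio.run_coroutine_threadsafe(after_callback(), loop)
    try:
        # result() re-raises any exception raised inside the coroutine.
        future.result(timeout=5)
    except Exception as exc:
        errors.append(exc)
        _log.error("Calling the after function failed.", exc_info=exc)


async def main() -> None:
    loop = asyncio.get_running_loop()
    # Run call_after in a thread so the event loop stays free to run the coroutine.
    await loop.run_in_executor(None, call_after, loop)


asyncio.run(main())
```

A try/except around run_coroutine_threadsafe alone only catches scheduling errors; the exception from the coroutine lives on the returned future until something asks for it.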

Mar 08 '23 17:03 Jourdelune

There is also a very common silent error, sometimes the converter does not find the file to convert.

import traceback
async def on_listening_stopped(sink, exc=None):
    try:
        sink.convert_files()
        print('ok')
    except Exception:
        traceback.print_exc()

Traceback (most recent call last):
  File "/home/jourdelune/Bureau/Interaction/Bot/test.py", line 61, in on_listening_stopped
    sink.convert_files()
  File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/sink.py", line 707, in convert_files
    file.convert(self._create_name(file))
  File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/sink.py", line 866, in convert
    with open(self.path, "rb") as file:
         ^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/audio-201351.pcm'

Mar 08 '23 19:03 Jourdelune

There is also a very common silent error: sometimes the converter does not find the file to convert.

Do you have any steps for reproducing this? I've never come across this problem, and I'm not sure how this exception could come about in the first place.

Mar 09 '23 06:03 Sheppsu

Here is an example that reproduces the exception:

from discord import app_commands

import discord
import asyncio


DEV_GUILD = discord.Object(id=913766363791757353)

class Client(discord.Client):
    def __init__(self, *, intents: discord.Intents):
        super().__init__(intents=intents)
        self.tree = app_commands.CommandTree(self)

    async def setup_hook(self):
        self.tree.copy_global_to(guild=DEV_GUILD)
        await self.tree.sync(guild=DEV_GUILD)

intents = discord.Intents.default()
bot = Client(intents=intents)


@bot.event
async def on_ready():
    print(f'Logged in as {bot.user} (ID: {bot.user.id})')
    print('------')


@bot.tree.command()
async def start(interaction: discord.Interaction):
    """Start listening."""

    vc = interaction.user.voice
        
    if not vc:
        return await interaction.response.send_message('You\'re not in a vc right now')

    voice_client = discord.utils.get(bot.voice_clients, guild=interaction.guild)
    if voice_client and voice_client.channel.id != vc.channel.id:
        await voice_client.move_to(vc.channel)
    else:
        voice_client = await vc.channel.connect()

    await interaction.response.send_message(f'Started listening')

    for _ in range(10):
        while voice_client.is_listening():
            pass 

        voice_client.listen(discord.AudioFileSink(discord.WaveAudioFile, './audio'), after=on_listening_stopped)

        await asyncio.sleep(1)

        voice_client.stop_listening()
 
    await interaction.channel.send(f'No longer listening.')


async def on_listening_stopped(sink, exc):
    sink.convert_files()
 

bot.run("Token")

However, I have the impression that the error appears randomly. Try changing:

def _convert_cleanup(self, new_path: str) -> None:
    os.remove(self.path)
    self.path = new_path
    self.file = None
    self.converted = True

to

def _convert_cleanup(self, new_path: str) -> None:
    self.path = new_path
    self.file = None
    self.converted = True

I'm looking into why that happens.

Mar 09 '23 16:03 Jourdelune

Should be fixed, just put something like this inside the loop:

while voice_client.is_listening() or voice_client.is_listen_cleaning():
    await asyncio.sleep(0.1)
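If it helps, the same wait can be wrapped in a small helper with a timeout. This is a minimal sketch that assumes only the is_listening() and is_listen_cleaning() methods used above; wait_until_idle and its poll and timeout parameters are hypothetical and not part of the library.

```python
import asyncio


async def wait_until_idle(voice_client, poll: float = 0.1, timeout: float = 10.0) -> bool:
    """Wait until the client is neither listening nor cleaning up.

    Returns True if it became idle, False if the timeout expired first.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while voice_client.is_listening() or voice_client.is_listen_cleaning():
        if loop.time() >= deadline:
            return False
        # Yield to the event loop instead of spinning, so other tasks keep running.
        await asyncio.sleep(poll)
    return True
```

Inside the loop it would be called as `await wait_until_idle(voice_client)` before the next voice_client.listen(...). Unlike a bare `while ...: pass`, it does not block the event loop while waiting.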

Mar 10 '23 04:03 Sheppsu

Okay, thank you for the fix ^^. I also have this issue:

File "/home/jourdelune/.local/lib/python3.11/site-packages/discord/gateway.py", line 953, in received_message
    "user": user if user is not None else Object(id=user_id, type=Member),

This happens because TYPE_CHECKING is False at runtime, so Member is never actually imported in gateway.py:

if TYPE_CHECKING:
    from typing_extensions import Self

    from .client import Client
    from .member import Member
    from .state import ConnectionState
    from .voice_client import VoiceClient
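The effect can be reproduced outside the library. The sketch below uses fractions.Fraction purely as a stand-in for Member; broken and fixed are hypothetical names, and the local import is just one possible workaround, not necessarily the one the PR should use.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers; skipped when the program runs.
    from fractions import Fraction


def broken() -> "Fraction":
    # Fraction was never imported at runtime, so this raises NameError.
    return Fraction(1, 2)


def fixed() -> "Fraction":
    # One possible workaround: import the name locally where it is used at runtime.
    from fractions import Fraction
    return Fraction(1, 2)
```

Names imported under TYPE_CHECKING are safe in annotations (quoted, or with `from __future__ import annotations`), but not in expressions that are evaluated at runtime, such as the `type=Member` argument in the traceback.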

Mar 10 '23 16:03 Jourdelune

Ah, thanks for catching that.

Mar 10 '23 18:03 Sheppsu

Thank you for the fix, but maybe instead of using

while voice_client.is_listening() or voice_client.is_listen_cleaning():

voice_client.stop_listening could return only once all of this has happened, right? (i.e. only when it has stopped listening and deleted the file).

Mar 11 '23 08:03 Jourdelune

voice_client.stop_listening calls AudioReceiver.stop, which simply sets some threading.Event objects telling the thread that it can end the main loop and clean up. Requiring voice_client.stop_listening to wait for all that cleanup (which also includes calling the after function) before returning feels like it could create some annoying bottlenecks, so I prefer this method.
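As a rough illustration of that pattern (not the actual AudioReceiver code), here is a simplified thread whose stop() only sets an event and returns, while cleanup and the after callback run in the thread itself. SketchReceiver and its attribute names are hypothetical.

```python
import threading


class SketchReceiver(threading.Thread):
    """Simplified stand-in for a receiver thread; not the PR's AudioReceiver."""

    def __init__(self, after=None):
        super().__init__(daemon=True)
        self._end = threading.Event()      # set by stop() to end the main loop
        self._cleaned = threading.Event()  # set once cleanup and `after` are done
        self.after = after

    def run(self):
        while not self._end.is_set():
            # Placeholder for receiving and handing off one chunk of audio.
            self._end.wait(0.01)
        # Cleanup and the `after` callback run here, in the receiver thread.
        if self.after is not None:
            self.after()
        self._cleaned.set()

    def stop(self):
        # Only signals the thread; returns immediately without waiting for cleanup.
        self._end.set()

    def is_cleaning(self) -> bool:
        return self._end.is_set() and not self._cleaned.is_set()
```

A caller that does need to wait can poll is_cleaning() (or join the thread), while callers that don't care are never blocked by a slow after function.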

Mar 11 '23 09:03 Sheppsu
