node-mumble Mix OutputStreams

If there are multiple InputStreams mix them.

May 09 '15 14:05 Prior99

Isn't there one? Connection.outputStream() without user session parameter should output the mixed audio, doesn't it?

May 10 '15 17:05 Rantanen

Sorry, I always confuse InputStream and OutputStream in this project. I meant InputStream. When writing to two instances of the connection's InputStream, the audio should be mixed, for example to play music and a sound effect at the same time.

May 10 '15 21:05 Prior99

Yeah, I wondered about this at some point, but I figured "can't be bothered implementing it at this point. The users can always do it themselves".. ^^

Might end up writing a simple mixing stream implementation that I can wrap stuff in.

May 10 '15 21:05 Rantanen

Would be great. By the way, would it be possible to send audio to a channel and/or whisper to (multiple) users at once? Currently these interfere with each other.

May 11 '15 04:05 Prior99

That's a bit tricky due to the Mumble protocol. Each audio packet requires a sequence number.

Currently what happens is:

Whisper to A: Packet 1
Whisper to B: Packet 2
Whisper to A: Packet 3  // A thinks packet 2 was lost
Whisper to B: Packet 4  // B thinks packet 3 was lost
etc.

If we do:

Whisper to A: Packet 1
Whisper to B: Packet 1
Whisper to A: Packet 2
Whisper to B: Packet 2
Whisper to A: Packet 3
Whisper to B: Packet 3

we'll run into complications when we mix normal talking into this, as those packets then need to share the sequence across the different users. And I believe clients are coded to ignore packets with a sequence number lower than what they last played.
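To make the two numbering schemes above concrete, here is a small simulation. It is only a sketch: the `Listener` class is a made-up model of a receiving client, and the rule that it drops any packet numbered at or below the last one played is the assumption stated above, not confirmed client behaviour.

```javascript
// Hypothetical model of a receiving client (not node-mumble or Mumble code).
class Listener {
  constructor() {
    this.last = null; // sequence number of the last packet played
    this.lost = 0;    // packets the listener believes went missing
    this.dropped = 0; // packets ignored because they looked stale
  }

  receive(seq) {
    if (this.last !== null && seq <= this.last) {
      this.dropped++; // assumed rule: ignore anything not newer than last played
      return;
    }
    if (this.last !== null) this.lost += seq - this.last - 1;
    this.last = seq;
  }
}

// Alternate six whisper packets between A and B, numbering them either
// from one shared counter or from one counter per target.
function simulate(sharedCounter) {
  const listeners = { A: new Listener(), B: new Listener() };
  const counters = { A: 0, B: 0 };
  let shared = 0;
  for (let i = 0; i < 6; i++) {
    const target = i % 2 === 0 ? 'A' : 'B';
    const seq = sharedCounter ? ++shared : ++counters[target];
    listeners[target].receive(seq);
  }
  return { lostA: listeners.A.lost, lostB: listeners.B.lost };
}

// A hears whisper packets 1..3, then normal talking starts from its own
// counter at 1: under the assumed rule that packet is dropped.
function talkAfterWhispers() {
  const a = new Listener();
  [1, 2, 3].forEach((seq) => a.receive(seq));
  a.receive(1);
  return a.dropped;
}

console.log(simulate(true));
console.log(simulate(false));
console.log(talkAfterWhispers());
```

With the shared counter each listener sees gaps it interprets as loss; with per-target counters the gaps disappear, but a second stream with its own counter can look stale to a listener that hears both.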

May 11 '15 04:05 Rantanen

Do they have to start from 0? What happens when we do this:

Whisper to A: Packet 9001
Whisper to Channel: Packet 1
Whisper to A: Packet 9002
Whisper to Channel: Packet 2
Whisper to A: Packet 9003
Whisper to Channel: Packet 3

May 11 '15 07:05 Prior99

There is a way to whisper to multiple users or channels at once. When wanting to whisper to multiple users AND channels, would it make sense to simply collect the users in the targeted channels and add them to the list of targeted users? That way we simply whisper to multiple users, which is supported by the protocol.
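A rough sketch of that flattening step, assuming a made-up data shape: each channel is a plain object with a `users` array of session ids. Neither the shape nor the function name comes from node-mumble.

```javascript
// Hypothetical shapes: channel = { name: string, users: number[] },
// sessions = number[] of explicitly targeted session ids.
function collectWhisperSessions(channels, sessions) {
  const result = new Set(sessions); // a Set removes duplicates
  for (const channel of channels) {
    for (const session of channel.users) result.add(session);
  }
  return [...result].sort((a, b) => a - b);
}

const targets = collectWhisperSessions(
  [{ name: 'lobby', users: [3, 4] }, { name: 'afk', users: [4, 5] }],
  [1, 3]
);
console.log(targets);
```

One caveat of this approach: the list is a snapshot, so it would have to be rebuilt whenever a user joins or leaves one of the targeted channels.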

May 11 '15 09:05 bendem

The problem is that the usual case is probably whispering DIFFERENT things to multiple users/channels. And then alternating between whispering different things and whispering same things.

This could be done by creating an individual voice target for each user and then handling what goes to whom in the library code, instead of letting murmur/MumbleClient take care of mixing. But this limits us to 30 users, as that's the maximum number of voice targets we can register.
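For illustration, the bookkeeping for one voice target per user could look roughly like this. `VoiceTargetPool` is a hypothetical helper, and the id range 1..30 is assumed from the limit mentioned above; registering the target with the server is left out entirely.

```javascript
const MAX_VOICE_TARGETS = 30; // limit quoted in the thread; ids assumed to be 1..30

// Hypothetical helper: hands out one voice target id per user session.
class VoiceTargetPool {
  constructor() {
    this.bySession = new Map(); // session id -> voice target id
    this.free = [];
    for (let id = MAX_VOICE_TARGETS; id >= 1; id--) this.free.push(id);
  }

  // Returns the target id for a session, or null when all ids are in use.
  acquire(session) {
    if (this.bySession.has(session)) return this.bySession.get(session);
    if (this.free.length === 0) return null;
    const id = this.free.pop();
    this.bySession.set(session, id);
    return id;
  }

  // Frees the id so another session can reuse it.
  release(session) {
    const id = this.bySession.get(session);
    if (id === undefined) return;
    this.bySession.delete(session);
    this.free.push(id);
  }
}

const pool = new VoiceTargetPool();
for (let session = 100; session < 130; session++) pool.acquire(session);
console.log(pool.acquire(999)); // no id left for a 31st user
```

The `null` return is where the 30-user ceiling shows up: the library would have to either refuse the whisper or evict an older target.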

May 11 '15 10:05 Rantanen

30 different users at once is better than 1 at once imo.

May 11 '15 11:05 Prior99

I'd imagine inability to broadcast to more than 30 users at the same point would be a major shortcoming? Note this would need to include normal talking as well since that uses sequence numbers too.

May 11 '15 15:05 Rantanen

I'd imagine inability to broadcast to more than 30 users at the same point would be a major shortcoming?

Not only that, but the server includes all of your audio streams in the same bandwidth counter. This means that audio quality will be severely impaired, due to the reduced amount of per-client bandwidth.

May 11 '15 15:05 ghost

Yeah. I think the solution to this issue is keeping track of the highest sequence number sent out and tagging each new voice transmission with that. Then, while a voice transmission is ongoing, keep incrementing the sequence number by one for every packet sent during that transmission.

The major difficulty with this one is that the library needs to differentiate between different voice transmissions. At the moment it's just outputting audio to the server without caring about the transmissions.
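A minimal sketch of that bookkeeping, under stated assumptions: `SequenceAllocator` and its method names are hypothetical, the caller is assumed to know when a transmission begins and ends (exactly the part described as difficult above), and a new transmission starts one past the highest number sent so that no listener sees a number it may already have played.

```javascript
// Hypothetical helper, not part of node-mumble.
class SequenceAllocator {
  constructor() {
    this.highest = 0;        // highest sequence number sent so far
    this.active = new Map(); // transmission id -> next sequence number
  }

  // A new transmission starts just above everything sent so far (assumption).
  begin(id) {
    this.active.set(id, this.highest + 1);
  }

  // Returns the sequence number for the next packet of this transmission.
  next(id) {
    if (!this.active.has(id)) this.begin(id);
    const seq = this.active.get(id);
    this.active.set(id, seq + 1);
    if (seq > this.highest) this.highest = seq;
    return seq;
  }

  end(id) {
    this.active.delete(id);
  }
}

const alloc = new SequenceAllocator();
alloc.begin('whisper-A');
console.log(alloc.next('whisper-A'), alloc.next('whisper-A'), alloc.next('whisper-A'));
alloc.end('whisper-A');
alloc.begin('talk');
console.log(alloc.next('talk'));
```

This only guarantees that a transmission never starts below numbers already sent. Two transmissions that overlap in time and share a listener can still produce equal numbers, so it does not settle that case.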

May 11 '15 16:05 Rantanen

This is harddddd... :<

Whispering to multiple users was actually easy compared to mixing multiple input streams together before sending them to the server.

The problem with this is timing: current input streams receive whatever audio is sent to them and then push it to the server. There are no timing issues here. Now, if we want to mix two input streams together, we'd need to line up their content in time.

When one stream pushes 1024 samples in, we can't just push these to the server, as we need to give the other stream a chance to push some samples as well. If the other stream doesn't push anything, then we need to decide at what point we give up and just push the samples from the one stream.

I think the solution would be to rewrite sendVoiceFrame to queue frames to an internal buffer and add an audio send interval to the MumbleConnection that dequeues frames, mixes them and then sends them.
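Roughly what that could look like, as a sketch only: `MixingQueue`, `queueFrame` and `tick` are hypothetical names, frames are assumed to be 16-bit PCM in an `Int16Array`, the frame size is an assumed 480 samples, and mixing is plain summing with clamping rather than anything smarter.

```javascript
const FRAME_SAMPLES = 480; // assumed frame size: 10 ms of 48 kHz mono audio

// Sum the frames sample by sample and clamp to the signed 16-bit range.
function mixFrames(frames, length) {
  const out = new Int16Array(length);
  for (let i = 0; i < length; i++) {
    let sum = 0;
    for (const frame of frames) sum += frame[i] || 0; // short frames count as silence
    out[i] = Math.max(-32768, Math.min(32767, sum));
  }
  return out;
}

// Hypothetical stand-in for the internal buffer plus send interval.
class MixingQueue {
  constructor(send, frameSamples = FRAME_SAMPLES) {
    this.send = send;         // callback that receives each mixed frame
    this.frameSamples = frameSamples;
    this.queues = new Map();  // stream id -> queued Int16Array frames
  }

  queueFrame(streamId, frame) {
    if (!this.queues.has(streamId)) this.queues.set(streamId, []);
    this.queues.get(streamId).push(frame);
  }

  // Take at most one frame from every stream, mix them, and send the result.
  // Returns false when no stream had anything queued.
  tick() {
    const frames = [];
    for (const queue of this.queues.values()) {
      if (queue.length > 0) frames.push(queue.shift());
    }
    if (frames.length === 0) return false;
    this.send(mixFrames(frames, this.frameSamples));
    return true;
  }
}

const sent = [];
const mixer = new MixingQueue((frame) => sent.push(frame), 4);
mixer.queueFrame('music', Int16Array.of(1000, 30000, -30000, 5));
mixer.queueFrame('sfx', Int16Array.of(2000, 10000, -10000, 0));
mixer.tick();
console.log(Array.from(sent[0]));
```

In real use `tick` would be driven by something like `setInterval(() => mixer.tick(), 10)`. The tick is also the point where we give up waiting: a stream that has nothing queued when the tick fires simply contributes silence to that frame.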

Although I'm a bit afraid of any delays in the audio. We already have some issues when sending audio to the client, where the client exhausts its buffer during playback before the jitter buffer has built up enough audio.

May 15 '15 13:05 Rantanen

(Oh, and multiple concurrent whispers was implemented in 19ef5c65883400febb3671f86872d14691ecba3c)

May 15 '15 13:05 Rantanen