Oto v2.0 has been released :)

Open mewmew opened this issue 2 years ago • 46 comments

Oto v2.0 was just released :) https://github.com/hajimehoshi/oto/releases/tag/v2.0.0

  • New APIs
    • Accepting io.Reader instead of io.Writer to create a player
    • Richer Player interface
  • New implementations
    • Much more buffers to avoid clicking noises

This issue tracks the work to update beep to use the latest version of Oto, bringing with it performance increases.

mewmew avatar Aug 28 '21 20:08 mewmew

Some notes:

  • The new Oto player requests samples from us, instead of us pushing samples to it.
  • UnplayedBufferSize() could potentially give us some fine-grained control over the buffer, however:
    • Oto's internal buffer seems to be 1/4 or 1/2 of a second long: link
    // oneBufferSize returns the size of one buffer in the player implementation.
    func (c *context) oneBufferSize() int {
     bytesPerSample := c.channelNum * c.bitDepthInBytes
     s := c.sampleRate * bytesPerSample / 4
    
     // Align s in multiples of bytes per sample, or a buffer could have extra bytes.
     return s / bytesPerSample * bytesPerSample
    }
    
    // maxBufferSize returns the maximum size of the buffer for the audio source.
    // This buffer is used when unreading on pausing the player.
    func (c *context) maxBufferSize() int {
     // The number of underlying buffers should be 2.
     return c.oneBufferSize() * 2
    }
    
    • It will only add the player after the whole buffer is filled and it will request samples until it has done so or it receives an EOF. This is fine if you want to play a song but for game sounds this is an unacceptable delay.
    if !p.eof {
    	buf := p.ensureTmpBuf()
    	for len(p.buf) < p.context.maxBufferSize() {
    		n, err := p.src.Read(buf)
    		if err != nil && err != io.EOF {
    			p.setErrorImpl(err)
    			return
    		}
    		p.buf = append(p.buf, buf[:n]...)
    		if err == io.EOF {
    			if len(p.buf) == 0 {
    				p.eof = true
    			}
    			break
    		}
    	}
    }
    
    if !p.eof || len(p.buf) > 0 {
    	p.state = playerPlay
    }
    
    p.m.Unlock()
    p.players.addPlayer(p)
    p.m.Lock()
    
    • After the player is added we can return however many samples we like. But it will keep requesting samples until the buffer is filled, so that may use CPU time unnecessarily.
  • I would like it if we could check which buffers on our end we really need and if they do potentially add any delays if they aren't filled completely. I want to reduce any delays we can.
  • It could be an option to remove beep.Mixer from the beep.Player implementation by adding an Oto player for each streamer. This would remove another buffer on our side. Another benefit is that the shared buffer doesn't need to be drained before an additional sound can be played. For example, for background music in a game it's fine to have >0.5sec buffered. But when you want to play an attack sound at the same time, you don't want to wait until the shared buffer is drained. When adding a new player, we still need to provide 0.5sec of samples of that sound, but we don't have to wait on the shared buffer to start playing. This may or may not make the speaker slightly unpredictable if you want to play 2 sounds at the exact same time. We could try to minimize this effect when implementing, but if that is not possible the user could always just use a mixer themselves.

I may be wrong on some things here. There are a lot of small details to oversee.

I would like to hear other people's thoughts/wishes on this as well.

I may try to implement Oto 2 with Beep. If it takes too long, feel free to work on this yourself. I can't spend loads of time on the computer right now.

MarkKremer avatar Aug 29 '21 10:08 MarkKremer

Wow, really thorough review of the Oto release. Thanks @MarkKremer!

I'm surprised to see there are additional delays with the new API, seeing as @hajimehoshi is also the author of Ebiten and uses Oto for game development themselves.

@hajimehoshi would you be able to help us clarify the performance improvements you've seen with the new release with regards to games? Also, have you seen any issues with delays as mentioned above?

Thanks for working on this! Both Oto and Beep are really great for game development in Go :)

Cheers, Robin

mewmew avatar Aug 29 '21 10:08 mewmew

I want to clarify a bit. The problem is with how Beep and Oto work(ed) together. Oto basically has its own mixer that adds the samples of each player together. Each player has a big buffer (which is good because you don't want to run out of samples). It's just that if we add our own mixer in front of it, then when we add a new streamer we have to wait for all samples currently in the buffer to be played before any of the new samples are played.

I think this could actually work out quite nicely, but we have to make some changes on our end. :)

Great work @hajimehoshi and contributors! I haven't tested it yet but having each player buffer individually and then combining the buffers once the platform-specific code requests it seems promising.

MarkKremer avatar Aug 29 '21 11:08 MarkKremer

Hi,

This is fine if you want to play a song but for game sounds this is an unacceptable delay.

I don't think so. It depends on the environment, but the audio library's buffer size should be much smaller. The delay is determined by the audio library's buffer size, not by the buffer sizes of Oto's players.

For example, on macOS, the OS's buffer size is 2048 [bytes] (= 0.005 [s] in 48000 [Hz] stereo)

https://github.com/hajimehoshi/oto/blob/main/driver_macos.go

would you be able to help us clarify the performance improvements you've seen with the new release with regards to games?

  • On browsers, Oto v2 uses WebAudio for mixing while Oto v1 did it by itself.
  • On Android, Oto v2 uses Oboe (C++) while Oto v1 used AudioTrack (Java).
  • Oto v2's player buffers up to 0.5 [s] to avoid underruns, while Oto v1 did not buffer nearly as much. In Oto v1's design such a buffer was impossible, since users might want to adjust e.g. volume in real time and Oto v1 had no API for that.

EDIT: Because of this, you can no longer update the data behind your io.Reader in real time: a change is only heard about 0.5 [s] after it is made. I think this might require changing Beep's current implementation.

Also, have you seen any issues with delays as mentioned above?

I haven't noticed any additional delays. Rather, v2 should improve latency, especially on Android. Please let me know if you find actual delays. I appreciate all the contributions.

Thanks,

hajimehoshi avatar Aug 29 '21 11:08 hajimehoshi

It's just that if we add our own mixer in front of it, then when we add a new streamer we have to wait for all samples currently in the buffer to be played before any of the new samples are played.

Ah right, if you do mixing on your side, the buffer sizes of Oto's players would matter.

hajimehoshi avatar Aug 29 '21 11:08 hajimehoshi

I've made a first draft. Some notes:

  • This will break backwards compatibility:
    • Close() will be removed, as it isn't part of Oto anymore and players can be closed individually.
    • Removed bufferSize param from speaker.Init().
    • Removed Lock() and Unlock(). Speakers can be controlled individually, the mixer isn't used anymore and most, if not all, of Oto's functions are thread safe.
  • I kept func Play(s ...beep.Streamer) for some backwards compatibility but I recommend using func NewPlayer(s beep.Streamer) Player. The Player struct gives more control over the player like playing/pausing playback and volume. Because of the internal buffer of the player, this will respond more quickly than when doing the same with Beep's Ctrl/Volume decorators.
  • I'm considering if Play() should add a mixer if multiple streamers are passed to it. This will guarantee that the streamers are played simultaneously. I think downsides are minimal. Alternatively, the user can do this themselves if needed.
  • NewPlayer() could also accept multiple streamers and use a mixer. It will then return a single player. Or is this overengineering?
  • I'm considering whether to add some global functions to close or clear all players. This will either require us to keep track of the speakers, or add the functions to Oto's Context object. I think I prefer the last option, or that the user keeps track of the players themselves. The downside of the first option is that it may leave the user with dangling Player objects that don't work anymore.
  • Besides UnplayedBufferSize() int, I've added SamplesPlayed() int and DurationPlayed() time.Duration to Player. I hope they will be relatively accurate.

Besides that, I'm still working on updating the examples to use the new speaker and make use of the player where useful.

MarkKremer avatar Sep 01 '21 14:09 MarkKremer

Besides that, I'm still working on updating the examples to use the new speaker and make use of the player where useful.

Incredible work @MarkKremer! Excited to take it for a spin, will be great to see if this gives performance improvements in a game I'm playing with. Had some issues with delays a while back.

mewmew avatar Sep 01 '21 19:09 mewmew

Hey, sorry for arriving at this issue so late. Thanks @MarkKremer for working on this, it would be great to update to the new Oto. However, I have to stress that the philosophy of Beep has to be kept intact. To spell it out concretely, I think this is one of the fundamental pillars of Beep's philosophy:

  • All loading, generating, mixing, or modifying the sound in any way is done purely in software, via the Streamer interface. The final result, a pure single stream of samples, is then somehow pushed to the speaker.

Why is this important? Several reasons:

  • It makes Beep extremely composable. You can literally combine anything with anything and keep combining.
  • Everything can be not just played, but also saved to a file, streamed via a network, whatever you want. The moment you start depending on a playback library to do the mixing, this goes out of the window.

Now, as @hajimehoshi said, if we do our own mixing (and we do!), then Oto's buffer size starts to matter because of latency. If this is still true, then I'm not sure we can upgrade because low latency is important in games and is actually one of the great features of Beep. This needs to be resolved before updating.

Anyway, thanks for the great work, I just wanted to clarify this.

faiface avatar Sep 17 '21 20:09 faiface

You could argue that we can of course still do our own mixing in software, but if we want latency-free playback, then we have to fall back on Oto's capabilities. This is also not good. If we have two methods of mixing and one of them is latency-free but not composable, and the other one introduces latency but composes well, then we simply have two methods which are both bad.

faiface avatar Sep 17 '21 20:09 faiface

For example, what about letting an Oto player have a callback function that is called just before pushing bytes/floats to the drivers? It would be just like Audio Worklet or ScriptProcessorNode, and you would be able to modify the bytes/floats in real time.

hajimehoshi avatar Sep 17 '21 21:09 hajimehoshi

@hajimehoshi Yes, that would be great! Btw, I think with this functionality, it would be possible to upgrade to Oto 2 without breaking any backwards compatibility, which would be ideal.

faiface avatar Sep 17 '21 21:09 faiface

Actually, without changing any of Beep's API.

faiface avatar Sep 17 '21 21:09 faiface

OK let me think. The implementation should not be difficult. Of course, I welcome your designs and/or PRs!

hajimehoshi avatar Sep 17 '21 21:09 hajimehoshi

Although now that I think about it, I'm not sure this would be sufficient. We would be able to modify the samples right before pushing, but you would still push 1/2s or 1/4s of samples at once, meaning the next batch of samples could only be added/modified after this time has passed. This would still keep the latency there.

faiface avatar Sep 17 '21 21:09 faiface

This might be extreme, but would using 'zero' io.Reader that emits zeros and mixing everything by yourself at the callback work for Beep?
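The 'zero' reader mentioned here is simple to write. A minimal sketch follows; the zeroReader name is made up for this example, and the callback that would do the actual mixing is only a proposal at this point in the thread, so it is not shown.

```go
package main

import (
	"fmt"
	"io"
)

// zeroReader (hypothetical name) is an io.Reader that produces silence:
// it fills every buffer with zero bytes and never reports EOF.
type zeroReader struct{}

func (zeroReader) Read(p []byte) (int, error) {
	for i := range p {
		p[i] = 0
	}
	return len(p), nil
}

func main() {
	var r io.Reader = zeroReader{}

	// Start with non-zero data to show that Read overwrites all of it.
	buf := []byte{1, 2, 3, 4, 5, 6, 7, 8}
	n, err := io.ReadFull(r, buf)
	fmt.Println(n, err, buf)
}
```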

hajimehoshi avatar Sep 17 '21 21:09 hajimehoshi

That would be the idea, but it would only work if the callback was called many times per second. In fact, the latency would be precisely proportional to that number (called N times per second implies latency of 1/N seconds). Would that be possible?

faiface avatar Sep 17 '21 21:09 faiface

Yeah for example calling the callback for each 100 samples = 480 times per second for 48000 Hz would be feasible.

hajimehoshi avatar Sep 17 '21 21:09 hajimehoshi

Yeah, 480 is way more than enough :D Even 16 is probably enough for games. But just to clarify, this would actually make it possible to change the contents of the buffer in such a way that if the speaker is currently playing some sample and I modify a sample that's 1/480 of a second before in the buffer, the speaker would actually end up playing the modified samples when it gets to that point?

faiface avatar Sep 17 '21 21:09 faiface

There is still a small delay due to the low-level drivers, OSes, and so on. I set these buffer sizes as small as possible in Oto v2, but some delay remains.

hajimehoshi avatar Sep 17 '21 21:09 hajimehoshi

I'll be afk soon. See you later :-)

hajimehoshi avatar Sep 17 '21 21:09 hajimehoshi

See ya! Hope we can resolve this eventually!

faiface avatar Sep 17 '21 21:09 faiface

@hajimehoshi so without the buffer between Beep and the driver code, any guesses on how fast the callback has to respond before it glitches?

I had the idea of creating a buffer streamer. Not like the one currently in Beep but more like how Oto buffers the audio: keep 0.5-1sec of audio buffered so it can provide samples quickly and when the buffer gets drained, fill it up with more samples. Having the buffer as a streamer makes it composable again:

[heavy tasks like decoding mp3, resampling etc.] -> [buffer] -> [cheap tasks like volume adjustments, pause/stop etc.] -> [speaker]
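A rough sketch of the [buffer] stage in that pipeline, to make the idea concrete. It is deliberately synchronous: it refills in large chunks only when its buffer runs dry, whereas the idea described above would presumably refill in the background before that happens. Streamer is a local stand-in modeled on beep.Streamer; buffered and counter are hypothetical types for this example.

```go
package main

import "fmt"

// Streamer is a local stand-in modeled on beep.Streamer.
type Streamer interface {
	Stream(samples [][2]float64) (n int, ok bool)
}

// counter (hypothetical) stands in for an expensive source such as a decoder.
// It emits 0, 1, 2, ... up to limit and records how often it is called.
type counter struct {
	next, limit, calls int
}

func (c *counter) Stream(samples [][2]float64) (int, bool) {
	c.calls++
	if c.next >= c.limit {
		return 0, false
	}
	n := 0
	for n < len(samples) && c.next < c.limit {
		v := float64(c.next)
		samples[n] = [2]float64{v, v}
		c.next++
		n++
	}
	return n, true
}

// buffered (hypothetical) pulls from src in large chunks and serves small
// requests from memory, so cheap downstream stages get samples quickly.
type buffered struct {
	src   Streamer
	chunk int          // samples requested from src per refill
	buf   [][2]float64 // fetched but not yet consumed samples
	done  bool         // src has reported that it is drained
}

func (b *buffered) fill() {
	if b.done {
		return
	}
	tmp := make([][2]float64, b.chunk)
	n, ok := b.src.Stream(tmp)
	b.buf = append(b.buf, tmp[:n]...)
	if !ok {
		b.done = true
	}
}

func (b *buffered) Stream(samples [][2]float64) (int, bool) {
	if len(b.buf) == 0 {
		b.fill()
	}
	if len(b.buf) == 0 {
		return 0, !b.done
	}
	n := copy(samples, b.buf)
	b.buf = b.buf[n:]
	return n, true
}

func main() {
	src := &counter{limit: 10}
	b := &buffered{src: src, chunk: 8}

	out := make([][2]float64, 3) // the consumer asks for small pieces
	total := 0
	for {
		n, ok := b.Stream(out)
		if !ok {
			break
		}
		total += n
	}
	fmt.Println("samples delivered:", total, "source calls:", src.calls)
}
```

Because buffered is itself a streamer, it can sit anywhere in a chain, which is what keeps the design composable.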

I have some thoughts about this but I would like to hear what @faiface thinks about this first.


If people have more/different ideas, I'd love to hear them.

MarkKremer avatar Sep 23 '21 20:09 MarkKremer

so without the buffer between Beep and the driver code, any guesses on how fast the callback has to respond before it glitches?

I think you mean how much internal buffering Oto's drivers have. This depends on the environment. In the current implementations:

  • Android: ? (this depends on the return value of oboe::AudioStream::getBufferSizeInFrames; it should be very small, maybe 0.01 [sec])
  • macOS: 2048 [bytes] = 256 [samples] in stereo = 0.005 [sec] in 48000 [Hz] stereo
  • iOS: 12288 [bytes] = 0.032 [sec] in 48000 [Hz] stereo
  • Linux/Unix: 2048 [frames] = 1024 [samples] in stereo = 0.021 [sec] in 48000 [Hz] stereo
  • Windows: 4096 [bytes] = 512 [samples] in stereo = 0.011 [sec] in 48000 [Hz] stereo
  • Wasm: This is special since mixing happens on the browser side, not the Go side. We can process streams in real time with Audio Worklet.

Note: Samples are represented in float32 in most cases.

hajimehoshi avatar Sep 24 '21 02:09 hajimehoshi

Quick heads up: https://github.com/hajimehoshi/oto/pull/160 has finally resulted in hajimehoshi re-adding configurable buffer sizes in Oto, in case you want to start experimenting with it. The performance and latency are extremely similar to what could already be achieved with UnplayedBufferSize (16ms for most desktop environments, and more like 50ms on browsers), but it is far more convenient and simpler to manage.

tinne26 avatar Mar 28 '22 09:03 tinne26

Any updates?

hajimehoshi avatar Oct 03 '22 16:10 hajimehoshi

If Oto v2 is not going to be used in Beep, there is little value in keeping backward compatibility for Oto's public APIs. I'd plan to freeze the Oto (v2) project and move Oto into Ebitengine as an internal package.

hajimehoshi avatar Oct 04 '22 05:10 hajimehoshi

If Oto v2 is not going to be used in Beep, there is little value in keeping backward compatibility for Oto's public APIs. I'd plan to freeze the Oto (v2) project and move Oto into Ebitengine as an internal package.

Oto is tremendously useful as it's the main Go package to give cross-platform audio playback.

@hajimehoshi, please keep Oto as an external package. It is used also outside of Beep and it's a really useful package.

With kindness, Robin

mewmew avatar Oct 04 '22 12:10 mewmew

re: https://github.com/faiface/beep/issues/128#issuecomment-910340152

I've made a first draft.

@MarkKremer I know you've been working on this. Do you still have your draft work for using Oto v2.0 in Beep?

Cheerful regards, Robin

mewmew avatar Oct 04 '22 12:10 mewmew

I just pushed whatever I had locally: https://github.com/faiface/beep/compare/master...MarkKremer:beep:master

Something something long time ago disclaimer :smile:

MarkKremer avatar Oct 04 '22 13:10 MarkKremer

@hajimehoshi, please keep Oto as an external package. It is used also outside of Beep and it's a really useful package.

Beep is much more widely used than Oto v2.

https://pkg.go.dev/github.com/faiface/beep?tab=importedby

https://github.com/search?l=Go&q=%22github.com%2Fhajimehoshi%2Foto%2Fv2%22&type=Code (66 files Oto v2) vs https://github.com/search?l=Go&q=%22github.com%2Ffaiface%2Fbeep%22&type=Code (1116 files for Beep)

So I thought the impact of no longer maintaining Oto v2 would be limited. That's one of the reasons why I'd stop maintaining Oto v2 if Beep didn't use it. What do you think?

hajimehoshi avatar Oct 04 '22 14:10 hajimehoshi