swift-opus icon indicating copy to clipboard operation
swift-opus copied to clipboard

Add opus custom support

Open emlynmac opened this issue 2 years ago • 15 comments

I've been working on an implementation of Jamulus' audio protocol, which uses a custom Opus setting.

The PR contains changes to get the custom mode up and running.

emlynmac avatar Apr 19 '22 19:04 emlynmac

Big PR, thanks! Some initial feedback that I think should be addressed before I can do a thorough review:

  1. I think there should be a CustomMode Swift type that wraps or is the underlying Opus C type.
  2. The constructors for Encoder and Decoder could accept a CustomMode, or implicitly handle it if the specified frame size isn’t standard.
  3. Please format according to the swiftformat spec in this repo. Changes with unformatted or differently formatted code will not be accepted.
  4. Please separate out any submodule changes into another PR (or remove them).

Thanks!

ydnar avatar Apr 19 '22 22:04 ydnar

Big PR, thanks! Some initial feedback that I think should be addressed before I can do a thorough review:

  1. I think there should be a CustomMode Swift type that wraps or is the underlying Opus C type.
  2. The constructors for Encoder and Decoder could accept a CustomMode, or implicitly handle it if the specified frame size isn’t standard.
  3. Please format according to the swiftformat spec in this repo. Changes with unformatted or differently formatted code will not be accepted.
  4. Please separate out any submodule changes into another PR (or remove them).

Thanks!

I forgot to update the submodule; done Just ran swiftformat too

To your first points around breaking those out further, I originally started down this path but stopped as it didn't really help the module be more usable from an API perspective. The underlying calls are different enough that to wrapping them up together was clearer and didn't need to change the existing code at all.

emlynmac avatar Apr 19 '22 22:04 emlynmac

To your first points around breaking those out further, I originally started down this path but stopped as it didn't really help the module be more usable from an API perspective. The underlying calls are different enough that to wrapping them up together was clearer and didn't need to change the existing code at all.

As I understand it, Opus custom encoders/decoders exist primarily (or only) to support nonstandard frame sizes. I don’t think that justifies a new, somewhat duplicative expansion of this package’s public API.

How about this: could you propose a minimal API change that enables nonstandard frame sizes? We can start right here in the PR comments.

Once we have a solid API proposal, then implement it.

ydnar avatar Apr 20 '22 15:04 ydnar

To your first points around breaking those out further, I originally started down this path but stopped as it didn't really help the module be more usable from an API perspective. The underlying calls are different enough that to wrapping them up together was clearer and didn't need to change the existing code at all.

As I understand it, Opus custom encoders/decoders exist primarily (or only) to support nonstandard frame sizes. I don’t think that justifies a new, somewhat duplicative expansion of this package’s public API.

How about this: could you propose a minimal API change that enables nonstandard frame sizes? We can start right here in the PR comments.

Once we have a solid API proposal, then implement it.

If there's a preferred API you'd like to make, I'm all up for that. I agree the almost duplication of some of the helper functions to expand the pointers is not ideal, but as I said - I did not want to disturb the other parts of the code here. The crux of the issue is being able to handle the fame size and also to call opus_custom_encode rather than opus_encode. I've also had to ensure that the expected packet size and nils are handled by the decoder too.

Speaking of the existing code, it looks like it's not handling dropped packets. My understanding is that if you have a dropped packet you should feed nil into the decoder to cover that case.

The decode method in Opus.Decoder isn't able to handle that:

private func decode(_ input: UnsafeBufferPointer<UInt8>, to output: UnsafeMutableBufferPointer<Float32>) throws -> Int {
		let decodedCount = opus_decode_float(
			decoder,
			input.baseAddress!,
			Int32(input.count),
			output.baseAddress!,
			Int32(output.count),
			0
		)
		if decodedCount < 0 {
			throw Opus.Error(decodedCount)
		}
		return Int(decodedCount)
	}

emlynmac avatar Apr 20 '22 16:04 emlynmac

Ok, I found some time :)

I've merged the coding / decoding changes in with the existing classes. See what you think of the changes!

emlynmac avatar Apr 20 '22 19:04 emlynmac

Speaking of the existing code, it looks like it's not handling dropped packets. My understanding is that if you have a dropped packet you should feed nil into the decoder to cover that case.

The decode method in Opus.Decoder isn't able to handle that:

Good observation. I’d be game to add support for this in a separate PR.

My instinct says that a decodeDroppedPacket method (rather than interpreting nil) is probably the right approach. Thoughts?

ydnar avatar Apr 21 '22 17:04 ydnar

Speaking of the existing code, it looks like it's not handling dropped packets. My understanding is that if you have a dropped packet you should feed nil into the decoder to cover that case. The decode method in Opus.Decoder isn't able to handle that:

Good observation. I’d be game to add support for this in a separate PR.

My instinct says that a decodeDroppedPacket method (rather than interpreting nil) is probably the right approach. Thoughts?

I've implemented a fix in the PR; no need to make a different method to handle the case of a pretty regular occurrence. Will make the API kinda strange to have to call a different decode API for nil.

emlynmac avatar Apr 21 '22 17:04 emlynmac

Speaking of the existing code, it looks like it's not handling dropped packets. My understanding is that if you have a dropped packet you should feed nil into the decoder to cover that case. The decode method in Opus.Decoder isn't able to handle that:

Good observation. I’d be game to add support for this in a separate PR. My instinct says that a decodeDroppedPacket method (rather than interpreting nil) is probably the right approach. Thoughts?

I've implemented a fix in the PR; no need to make a different method to handle the case of a pretty regular occurrence. Will make the API kinda strange to have to call a different decode API for nil.

But from the caller’s perspective, a dropped packet isn’t nil. It just never arrived, hence can be handled with a more specific API. The goal of this package was to present a Swift-like interface to the Opus library, and not necessarily mirror its C API.

ydnar avatar Apr 21 '22 17:04 ydnar

Speaking of the existing code, it looks like it's not handling dropped packets. My understanding is that if you have a dropped packet you should feed nil into the decoder to cover that case. The decode method in Opus.Decoder isn't able to handle that:

Good observation. I’d be game to add support for this in a separate PR. My instinct says that a decodeDroppedPacket method (rather than interpreting nil) is probably the right approach. Thoughts?

I've implemented a fix in the PR; no need to make a different method to handle the case of a pretty regular occurrence. Will make the API kinda strange to have to call a different decode API for nil.

But from the caller’s perspective, a dropped packet isn’t nil. It just never arrived, hence can be handled with a more specific API. The goal of this package was to present a Swift-like interface to the Opus library, and not necessarily mirror its C API.

If you look at how you'd be using this, it would be from a jitter buffer. Typically that will just have the read method called periodically as the audio system requests more data. The buffer would either return a compressed packet or a nil if no data. Simply feeding that into the encoder is somewhat cleaner than calling a separate method that ultimately calls the same opus method under the hood.

emlynmac avatar Apr 21 '22 17:04 emlynmac

If you look at how you'd be using this, it would be from a jitter buffer. Typically that will just have the read method called periodically as the audio system requests more data. The buffer would either return a compressed packet or a nil if no data. Simply feeding that into the encoder is somewhat cleaner than calling a separate method that ultimately calls the same opus method under the hood.

Right now, there are two public decode methods. One accepts Data, and the other accepts UnsafeBufferPointer<UInt8>.

Adding support for nil packets to both of these methods doesn’t seem great, and adding two additional methods to decode a dropped packet also doesn’t seem great.

ydnar avatar Apr 21 '22 17:04 ydnar

If you look at how you'd be using this, it would be from a jitter buffer. Typically that will just have the read method called periodically as the audio system requests more data. The buffer would either return a compressed packet or a nil if no data. Simply feeding that into the encoder is somewhat cleaner than calling a separate method that ultimately calls the same opus method under the hood.

Right now, there are two public decode methods. One accepts Data, and the other accepts UnsafeBufferPointer<UInt8>.

Adding support for nil packets to both of these methods doesn’t seem great, and adding two additional methods to decode a dropped packet also doesn’t seem great.

Both of those methods already have support in my PR. No changes needed; just using an empty data signifies a nil, which is then passed into the decode method. This has the added benefit of not crashing in the case where the bufferPtr is nil.

emlynmac avatar Apr 21 '22 17:04 emlynmac

@emlynmac thanks for your efforts here, I appreciate it.

I think we should split out decoding dropped packets into a separate thread of work (probably a new PR), which is relevant for both custom and non-custom modes. I think this could help improve/inform the public API design for custom mode support.

ydnar avatar Apr 21 '22 17:04 ydnar

@emlynmac thanks for your efforts here, I appreciate it.

I think we should split out decoding dropped packets into a separate thread of work (probably a new PR), which is relevant for both custom and non-custom modes. I think this could help improve/inform the public API design for custom mode support.

You're welcome, I needed it to get jamulus working on swift, so I figured I'd pass it back.

If you want to refactor the dropping into another method, then that's def another PR. I need it to work for now, but that's why we have forks :)

emlynmac avatar Apr 21 '22 17:04 emlynmac

I want this to land. Supporting dropped packets and supporting custom modes are two distinct and valuable features, and should be treated as such. Given that dropped packet support is relevant for all users of this package, I think it’s a good candidate to extract and land first. Then we can streamline and simplify the discussion around the design for support for custom modes.

ydnar avatar Apr 21 '22 17:04 ydnar

I want this to land. Supporting dropped packets and supporting custom modes are two distinct and valuable features, and should be treated as such. Given that dropped packet support is relevant for all users of this package, I think it’s a good candidate to extract and land first. Then we can streamline and simplify the discussion around the design for support for custom modes.

From my perspective, the cleanest way would be to let the public decode methods take optionals. That way, you can have a separate internal method to handle the dropped packet case, but the external API is unchanged.

emlynmac avatar Apr 21 '22 17:04 emlynmac