homebridge-camera-ffmpeg

Two-way audio discussion.

Open hjdhjd opened this issue 3 years ago • 50 comments

@Sunoo to continue the discussion on two-way audio from elsewhere:

With respect to the two-way audio topic we started...I think the way you get audio back to the camera is ultimately going to be camera-specific. For example, I actively work on two plugins, homebridge-unifi-protect2 and homebridge-doorbird, and each takes audio back in a different way.

The way I’d suggest solving this in a general way is to not worry about it and make it the user’s problem to a degree. Use a callback or function pointer to call a custom function that has the job of sending that audio out to its final destination.

Perhaps create an extension system where people can contribute camera/manufacturer-specific two-way “send” functions...and then you can enable support in a consistent, generic way.

Those of us that look to your codebase as a starting point can leverage that as well to complete our own implementations. Thoughts?
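
A minimal sketch of the callback/extension idea above, assuming a simple in-process registry. Every name here (`ReturnAudioSender`, `ReturnAudioRegistry`, the `"doorbird"` key) is hypothetical and does not exist in homebridge-camera-ffmpeg; a real sender would push the audio to the camera instead of recording it in an array.

```typescript
// Hypothetical types: a "sender" receives chunks of return audio and is
// responsible for delivering them to one specific kind of camera.
type ReturnAudioSender = (chunk: Uint8Array) => void;

interface ReturnAudioTarget {
  host: string;
  port: number;
}

type SenderFactory = (target: ReturnAudioTarget) => ReturnAudioSender;

// Registry that camera/manufacturer-specific code can contribute to.
class ReturnAudioRegistry {
  private readonly factories = new Map<string, SenderFactory>();

  register(name: string, factory: SenderFactory): void {
    if (this.factories.has(name)) {
      throw new Error(`sender "${name}" is already registered`);
    }
    this.factories.set(name, factory);
  }

  create(name: string, target: ReturnAudioTarget): ReturnAudioSender {
    const factory = this.factories.get(name);
    if (!factory) {
      throw new Error(`no return-audio sender named "${name}"`);
    }
    return factory(target);
  }
}

// Example registration: a stand-in sender that only records what it was
// asked to send, so the wiring can be exercised without a camera.
const sent: Array<{ to: string; bytes: number }> = [];
const registry = new ReturnAudioRegistry();
registry.register("doorbird", (target) => (chunk) => {
  sent.push({ to: `${target.host}:${target.port}`, bytes: chunk.length });
});
```

The generic plugin would only ever call `registry.create(...)` and the returned function, which keeps the camera-specific part out of the shared code path.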

hjdhjd avatar Aug 18 '20 21:08 hjdhjd

I’d just have to think of how that would actually work. I have done some things with helper plugins to add support for FTP and SMTP motion alerts, perhaps I could come up with some way of doing a similar thing for audio return.

I’m a little reluctant to add just dead code that’s only used by people who fork the plugin, but that would obviously be the easiest path forward. I’m just having a little trouble envisioning a reasonable way for two plugins to talk to each other to accomplish this.

Sunoo avatar Aug 18 '20 21:08 Sunoo

@hjdhjd Oh, and semi-related, but me and a bunch of other camera plugin developers are on the Homebridge Discord, if you ever wanted to talk there instead.

Sunoo avatar Aug 18 '20 21:08 Sunoo

Happy to. Point me to it? We can pick this up there.

hjdhjd avatar Aug 18 '20 22:08 hjdhjd

@hjdhjd https://discord.gg/Z8jmyvb

Sunoo avatar Aug 18 '20 22:08 Sunoo

HomeKit manages two-way audio like an RTSP "backchannel". ffmpeg won't support it easily because "out of the box" it handles many input streams but just one output! Changing this behavior requires a deep rewrite of the ffmpeg run loop and overall architecture, so in the end it wouldn't be ffmpeg anymore but a new program.

I successfully run two-way audio with a UDP proxy that "splits" the RTCP and backchannel traffic, and another ffmpeg to handle the latter, like homebridge-doorbird does.

Unfortunately I hate TypeScript, while I love JavaScript for being an "untyped" true scripting language (do you know Lisp?), so I won't contribute to TypeScript plugins, sorry.

I also started to think that, given the increased complexity of the task, it would be better to build camera bridges using the Apple open source HK framework.
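
A rough sketch of the "split" decision such a UDP proxy has to make, assuming RTP and RTCP arrive multiplexed on one port and are told apart the way RFC 5761 describes (second byte in the range 192 to 223 means RTCP). The function names and the two forwarding ports are hypothetical; the actual socket handling is left out.

```typescript
type PacketKind = "rtp" | "rtcp" | "unknown";

// Classify one UDP datagram. Both RTP and RTCP carry version 2 in the top
// two bits of the first byte. RTCP packet types (200-204 in practice) sit in
// the second byte, so 192..223 is treated as RTCP, assuming the RTP payload
// types in use avoid 64..95 as RFC 5761 recommends.
function classifyPacket(packet: Uint8Array): PacketKind {
  if (packet.length < 2) return "unknown";
  if (packet[0] >> 6 !== 2) return "unknown";
  const secondByte = packet[1];
  return secondByte >= 192 && secondByte <= 223 ? "rtcp" : "rtp";
}

// Hypothetical local ports: one where the "main" ffmpeg expects RTCP,
// one where a second ffmpeg listens for the return audio.
interface SplitterPorts {
  rtcpPort: number;
  returnAudioPort: number;
}

// Decide where a datagram should be forwarded, or undefined to drop it.
function pickForwardPort(packet: Uint8Array, ports: SplitterPorts): number | undefined {
  switch (classifyPacket(packet)) {
    case "rtcp":
      return ports.rtcpPort;
    case "rtp":
      return ports.returnAudioPort;
    default:
      return undefined;
  }
}
```

In a real proxy this would be called from the UDP socket's message handler, with the result passed to a `send` on the same socket.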

cheers

llemtt avatar Aug 19 '20 07:08 llemtt

The easy way to handle that is just to run a second instance of FFmpeg to handle the return audio.

The more complex issue I’m thinking through is how to handle actually sending audio back to the camera. I’ve basically come to the conclusion that I’ll have to pass the audio off to another Homebridge plugin that can handle actually sending it on to the camera.

Sunoo avatar Aug 19 '20 14:08 Sunoo

> I’ve basically come to the conclusion that I’ll have to pass the audio off to another Homebridge plugin that can handle actually sending it on to the camera.

Why do you think that? I actually use the output of the return "ffmpeg" instance.

BTW, you can't use the RTSP backchannel even if your camera supports it (ONVIF requires it for two-way audio); you have to use some other, separate API. The RTSP backchannel requires negotiation, setup, and control to be done in the context of the RTSP session, which is actually managed by the "main" ffmpeg instance.

llemtt avatar Aug 19 '20 15:08 llemtt

Because as far as I can tell, there is no standard for sending audio back to cameras. Every camera seems to do its own thing. Handing that implementation off to a camera-specific plugin seems like the best approach to me?

Or am I mistaken and there is a standard, or small set of standards, that I could have a hope of implementing? I don’t actually own any cameras that have two way audio support at the moment, so I can’t say for sure.

Sunoo avatar Aug 19 '20 15:08 Sunoo

There are lots of different (often proprietary) ways to get audio back. Just look at Ring, Nest, Doorbird, and UniFi Protect Doorbell (I develop for the last two)...they all have different mechanisms for getting audio back.

hjdhjd avatar Aug 19 '20 15:08 hjdhjd

@llemtt I saw your note above...do you have an example some of us (i.e. me :smile:) can use as at least a reference point as we tackle two-way audio in our respective plugins? I get TS isn't your jam. And yeah...I know Lisp...and elisp...and that's taking me back a ways. :smile:

Specifically...which tools (ffmpeg or others) and command lines did you use to run the UDP proxy?

Right now, I'm playing with just plain trying to get audio streamed via UDP to a device. It doesn't seem to accept RTSP, it looks more like just pure AAC over UDP. Thoughts?

hjdhjd avatar Aug 19 '20 15:08 hjdhjd

@hjdhjd You can see a good example of that in the Ring plugin with the RTPSplitter dgreif wrote for that.

Sunoo avatar Aug 19 '20 16:08 Sunoo

@Sunoo I've looked at it...I'm concerned about timing issues with that particular approach, but it's a start for sure.

hjdhjd avatar Aug 19 '20 17:08 hjdhjd

I don’t know how you’d do it otherwise, honestly.

Sunoo avatar Aug 19 '20 17:08 Sunoo

That's what makes these things so fun... :smile:

hjdhjd avatar Aug 19 '20 17:08 hjdhjd

ffmpeg.js.txt

Attached is the version of ffmpeg.js I currently use inside the homebridge-videodoorbell plugin.

The Ring plugin does almost exactly what I did (I even copied their fdk-aac codec configuration because it works better than mine).

There was also an RTP proxy implementation inside hap-nodejs, but I never understood how to get it working.

llemtt avatar Aug 19 '20 17:08 llemtt

> Right now, I'm playing with just plain trying to get audio streamed via UDP to a device. It doesn't seem to accept RTSP, it looks more like just pure AAC over UDP. Thoughts?

Can you already stream an audio (file) with ffmpeg to that device?

llemtt avatar Aug 19 '20 17:08 llemtt

@llemtt Thanks, I'll take a look in a bit. As to streaming via ffmpeg from the command line...that's the step I'm battling with at the moment. Step 1 is to be able to get anything to output out of the damn thing, period. As I said...it looks like it takes AAC over UDP...just trying to figure out how to send it without encapsulating it in a transport protocol like mpegts or others...thoughts?

hjdhjd avatar Aug 19 '20 17:08 hjdhjd

I spent a chunk of today doing some digging into this topic, and it seems like there are two main methods of sending audio back to a camera: VAPIX's HTTP POST-based method, and ONVIF's RTSP audio backchannel.

Both methods look possible to support using FFmpeg, so I plan to drop the extension plugin idea (at least for now) and target both of those to begin with. Going by the standards alone, VAPIX looks easier. However, based on what I've read today, the RTSP backchannel looks possible in most if not all cases without technically doing the ONVIF negotiation. I believe I could implement actual ONVIF support if needed, but that wouldn't be part of the initial two-way audio version.

I'm trying to track down a cheap camera that supports one or both of these methods to develop against. It looks like the cheapest reasonable option will likely be a second-hand AXIS camera that supports VAPIX; I just need to track one down on eBay or similar. Unfortunately, ONVIF Profile T cameras (the profile that supports two-way audio) seem to be much more expensive and harder to identify, so I doubt I'll end up getting my hands on one.
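
To make the VAPIX option concrete, here is a hedged sketch of how a return-audio FFmpeg invocation for an HTTP POST target could be assembled. Nothing in this thread confirms the details: the `/axis-cgi/audio/transmit.cgi` path and the G.711 µ-law, 8 kHz, mono format are assumptions taken from Axis's VAPIX audio documentation, the credentials-in-URL form is an assumption about what a given camera accepts, and `buildVapixTransmitArgs` is a hypothetical helper. The FFmpeg options used (`-re`, `-vn`, `-acodec`, `-ar`, `-ac`, `-f mulaw`, and the HTTP protocol options `-chunked_post` and `-content_type`) are real.

```typescript
// Hypothetical description of the camera to send audio to.
interface VapixTarget {
  host: string;
  username: string;
  password: string;
}

// Build an FFmpeg argument list that reads `input` (a file is enough for a
// first test), transcodes it to G.711 mu-law and POSTs it to the camera.
function buildVapixTransmitArgs(input: string, target: VapixTarget): string[] {
  const auth = `${encodeURIComponent(target.username)}:${encodeURIComponent(target.password)}`;
  // Assumed endpoint, see the note above.
  const url = `http://${auth}@${target.host}/axis-cgi/audio/transmit.cgi`;
  return [
    "-re",                           // read input at its native rate
    "-i", input,
    "-vn",                           // audio only
    "-acodec", "pcm_mulaw",
    "-ar", "8000",
    "-ac", "1",
    "-f", "mulaw",                   // raw mu-law, no container
    "-chunked_post", "0",            // plain POST body instead of chunked encoding
    "-content_type", "audio/basic",
    url,
  ];
}
```

The resulting array can be handed to whatever already spawns FFmpeg, which also makes it easy to try by hand against a test file before wiring in HomeKit's return audio.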

Sunoo avatar Aug 20 '20 02:08 Sunoo

It's a simpler, if less flexible, approach. I'll be watching eagerly.

hjdhjd avatar Aug 20 '20 02:08 hjdhjd

Yea, but it should be a decent starting place at least. I may revisit the idea of kicking the audio over to other extensions in the future, but I'm wondering if those scenarios make more sense for someone to just fork this plugin at that point.

Sunoo avatar Aug 20 '20 02:08 Sunoo

Totally agree. This is going to be an iterative process, no doubt. Let's start somewhere. Eager to see the next step once you find a camera or two...

hjdhjd avatar Aug 20 '20 02:08 hjdhjd

I agree too. One-way audio cameras, two-way audio (surveillance) cameras, and video doorbells are different products meant to support very different use cases, although they share 99% of the technology, so one plugin to control them all may not be a good idea.

Cameras with two-way audio are few and usually expensive, and some of them don't even incorporate a speaker or an amplifier (just line-level out), which means you must buy and install additional hardware. It makes a lot of sense to me to assess what's actually on the market before going ahead and deciding what to eventually support, although the solution I implemented is "configurable" and works with whatever camera or device you can send audio to using ffmpeg (e.g. a DIY Raspberry Pi camera).

VAPIX and other HTTP POST based devices are the easiest to work with indeed!

llemtt avatar Aug 20 '20 09:08 llemtt

I use this plugin for my 2N doorbell (among other IP cameras). Two-way audio support interests me greatly because currently I have to use/host a 3CX SIP server and run the 3CX app on my phone for doorbell functionality to work.

I'm not sure if it covers all use cases, but I believe most IP-based doorbells (e.g. Ring, DoorBird) support the SIP protocol for two-way audio communication. I wonder whether supporting SIP natively would cover most of the use cases?

I had a quick look into how the homebridge-ring plugin works, and my understanding is that it basically initiates a SIP call with the doorbell (as far as I know, all SIP-compatible devices can act as both SIP servers and SIP clients), with the audio encoded by ffmpeg and packaged as SRTP. If that could be standardized, that would be amazing.

longzheng avatar Aug 25 '20 07:08 longzheng

@longzheng If you want your doorbell to "ring" your phone, you have to use SIP or something similar (FaceTime? WZP?), because HomeKit can only trigger a notification that barely emits a single "ping" I can never hear.

I do that in my plugin using linphone, just to ring the phone; then I go into the HomeKit camera to talk. If HomeKit's video doorbell support doesn't improve, I'll move back to a SIP-like solution.

I also considered buying the 2N, but it looked too expensive, and DIY is more fun!

llemtt avatar Aug 25 '20 07:08 llemtt

> @longzheng If you want your doorbell to "ring" your phone, you have to use SIP or something similar (FaceTime? WZP?), because HomeKit can only trigger a notification that barely emits a single "ping" I can never hear.
>
> I do that in my plugin using linphone, just to ring the phone; then I go into the HomeKit camera to talk. If HomeKit's video doorbell support doesn't improve, I'll move back to a SIP-like solution.
>
> I also considered buying the 2N, but it looked too expensive, and DIY is more fun!

Right, I've already set up this plugin's doorbell feature using an HTTP trigger (the 2N doorbell has a UI to configure HTTP triggers and events).

I do see the doorbell notifications, but I admit that because I get the 3CX SIP call at the same time, I don't know whether I would miss HomeKit-only doorbell notifications.

What do you use to "ring" the phone? VOIP or something like Twilio?

longzheng avatar Aug 25 '20 08:08 longzheng

> What do you use to "ring" the phone? VOIP or something like Twilio?

I have a SIP account on the linphone (free) registrar and I use the linphonec CLI client on the raspi to issue the call.

llemtt avatar Aug 25 '20 09:08 llemtt

@longzheng I’m gonna be honest, I don’t expect that I’ll be adding SIP support to this plugin, that sounds like the kind of thing best suited for a fork specific to that device. I’m just not sure that SIP is common enough of a return audio method, and it would add a decent amount of complexity.

If you have any documentation on how your doorbell works, I’ll take a look though.

Sunoo avatar Aug 25 '20 13:08 Sunoo

@Sunoo No worries, appreciate the heads up.

The actual standard/protocol is called "SIP Direct Call", which allows you to make a connection to the device without a SIP server/proxy (which is how SIP is normally set up). Basically, I can use a standard SIP client/app and point it to the IP of the doorbell, and initiating a SIP call will just work. Some info about it here: https://stackoverflow.com/questions/8516133/how-can-i-make-call-between-direct-ip-to-ip-without-sip-server

There's not much documentation about it on my doorbell manufacturer's website except to say it works: https://wiki.2n.cz/hip/inte/latest/en/1-pbx/direct-call
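
For readers unfamiliar with what "pointing a SIP client at an IP" means on the wire, here is a minimal sketch that only builds the text of a direct INVITE request in the general shape RFC 3261 describes. It is not how homebridge-ring or any 2N device is confirmed to work: `buildDirectInvite`, the `homebridge` user name, and all option names are hypothetical, and a real call would still need to send this over UDP or TCP (typically port 5060), handle the responses, send an ACK, and carry the audio over RTP/SRTP.

```typescript
// Hypothetical inputs for a direct IP-to-IP call (no registrar or proxy).
interface DirectCallOptions {
  localIp: string;
  localPort: number;
  remoteIp: string; // the doorbell/intercom being called
  callId: string;   // unique per call
  branch: string;   // unique per transaction
  tag: string;      // unique per dialog participant
  sdp: string;      // session description offering the audio stream
}

function buildDirectInvite(o: DirectCallOptions): string {
  // Content-Length counts bytes, not characters.
  const bodyLength = new TextEncoder().encode(o.sdp).length;
  const lines = [
    `INVITE sip:${o.remoteIp} SIP/2.0`,
    // "z9hG4bK" is the fixed prefix RFC 3261 requires on branch values.
    `Via: SIP/2.0/UDP ${o.localIp}:${o.localPort};branch=z9hG4bK${o.branch}`,
    "Max-Forwards: 70",
    `From: <sip:homebridge@${o.localIp}>;tag=${o.tag}`,
    `To: <sip:${o.remoteIp}>`,
    `Call-ID: ${o.callId}`,
    "CSeq: 1 INVITE",
    `Contact: <sip:homebridge@${o.localIp}:${o.localPort}>`,
    "Content-Type: application/sdp",
    `Content-Length: ${bodyLength}`,
  ];
  // Headers and body are separated by an empty line; SIP uses CRLF.
  return lines.join("\r\n") + "\r\n\r\n" + o.sdp;
}
```

The Request-URI and the `To` header address the device by IP alone, which is the part that distinguishes a direct call from one routed through a SIP server.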

homebridge-ring seems to use this SIP Direct Call behaviour in their plugin for the two-way audio functionality https://github.com/dgreif/ring/commit/0bdb15410b72deba3e89e562ce7164d839574adb

longzheng avatar Aug 26 '20 00:08 longzheng

Yea, I’ve talked to dgreif quite a bit about two-way in general (and also use the Ring plugin myself), and have been glad that I’ve managed to avoid dealing with SIP myself. :P

Maybe once I get easier methods done I’ll jump into SIP, but without a device to test against, it’ll be very hard to be sure I’ve got anything right.

Sunoo avatar Aug 26 '20 03:08 Sunoo

> Yea, I’ve talked to dgreif quite a bit about two-way in general (and also use the Ring plugin myself), and have been glad that I’ve managed to avoid dealing with SIP myself. :P

Yeah I appreciate what you mean, I took a look at the SIP code as well and it is pretty complex to understand.

> Maybe once I get easier methods done I’ll jump into SIP, but without a device to test against, it’ll be very hard to be sure I’ve got anything right.

So it is my understanding a lot of SIP clients/apps also support SIP direct calling behaviour, so in theory you should be able to install a SIP app on a Mac/Windows/iOS/Android device and then "call" that device using the IP.

For example, this is a list of "softphones" that my doorbell's manufacturer claims work via Direct Call as well (in this scenario the intercom would be direct-calling the softphone via IP): https://wiki.2n.cz/hip/inte/latest/en/3-softphones I'd imagine that for testing you could use any of these softphones to simulate an intercom.

longzheng avatar Aug 26 '20 04:08 longzheng