homebridge-camera-ffmpeg

Add support for two-way audio for SIP based video doorbell

Open nanosonde opened this issue 2 years ago • 11 comments

Hi!

This PR addresses this issue https://github.com/Sunoo/homebridge-camera-ffmpeg/issues/928.

It adds support for two-way audio between HomeKit and a SIP-based video doorbell. SIP-based video doorbells seem to exist in at least two different flavours:

  1. Video is completely separate, e.g. an MJPEG stream via HTTP only. SIP is used only for audio.
  2. Video is an H264 stream that is negotiated as an additional stream (next to audio) during SIP signaling.

This PR only addresses variant 1). However, adding variant 2) should not be too difficult.

The solution presented here does not involve using audio devices (e.g. alsa loopback) or the like and it does not use an external SIP softphone to handle the SIP call. It uses the well-known SIP stack from kirm.

The SIP call is a direct SIP call between two peers. No proxy or registrar is involved. A SIP INVITE is sent with the suggested RTP stream and codec parameters (SDP), and the SDP in the SIP response from the video doorbell is parsed. Based on these two SDPs, the two FFmpeg instances are configured accordingly. As a result, the SIP video doorbell sends and receives the audio streams directly to/from the homebridge-camera-ffmpeg plugin (two FFmpeg instances), in full duplex.
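The offer/answer step described above can be sketched roughly as follows. This is only an illustration, not the code from the PR: the names `buildAudioOffer`, `parseAudioAnswer` and `AudioEndpoint` are hypothetical, and the sketch assumes an audio-only IPv4 offer.

```typescript
// Hypothetical helper names; not taken from the PR.
interface AudioEndpoint {
  address: string;        // where the peer wants to receive RTP
  port: number;
  payloadTypes: number[]; // RTP payload types listed on the m=audio line
}

// Build a minimal audio-only SDP offer advertising G.711 u-law (payload type 0).
function buildAudioOffer(localAddress: string, rtpPort: number): string {
  return [
    "v=0",
    `o=- 0 0 IN IP4 ${localAddress}`,
    "s=homebridge",
    `c=IN IP4 ${localAddress}`,
    "t=0 0",
    `m=audio ${rtpPort} RTP/AVP 0`,
    "a=rtpmap:0 PCMU/8000",
    "a=sendrecv",
  ].join("\r\n") + "\r\n";
}

// Extract the first audio endpoint from the peer's SDP answer.
// A c= line inside the audio section overrides the session-level c= line.
function parseAudioAnswer(sdp: string): AudioEndpoint | undefined {
  let sessionAddress: string | undefined;
  let audioAddress: string | undefined;
  let audioPort: number | undefined;
  let payloadTypes: number[] = [];
  let seenMedia = false;
  let inFirstAudio = false;

  for (const line of sdp.split(/\r?\n/)) {
    const media = /^m=(\w+) (\d+) \S+ (.*)$/.exec(line);
    const connection = /^c=IN IP4 (\S+)/.exec(line);
    if (media) {
      seenMedia = true;
      inFirstAudio = media[1] === "audio" && audioPort === undefined;
      if (inFirstAudio) {
        audioPort = Number(media[2]);
        payloadTypes = media[3].trim().split(/\s+/).map((pt) => Number(pt));
      }
    } else if (connection) {
      if (!seenMedia) sessionAddress = connection[1];
      else if (inFirstAudio) audioAddress = connection[1];
    }
  }

  const address = audioAddress !== undefined ? audioAddress : sessionAddress;
  if (address === undefined || audioPort === undefined) return undefined;
  return { address, port: audioPort, payloadTypes };
}
```

The endpoint returned by `parseAudioAnswer` is what the sending FFmpeg instance would target, while the port passed to `buildAudioOffer` is where the doorbell is asked to send its audio.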

For now, the codec negotiated with the SIP doorbell is hard-coded to G.711 (u-law). All SIP devices should support this codec. For example, my doorbell only supports G.711 u-law and A-law.
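As a rough sketch of what a G.711 choice means on the FFmpeg side: payload type 0 corresponds to FFmpeg's `pcm_mulaw` encoder and payload type 8 to `pcm_alaw`, both at 8000 Hz mono. The helper names `pickG711` and `rtpOutputArgs` are hypothetical, and the A-law fallback is an assumption for illustration (the PR itself hard-codes u-law).

```typescript
// Hypothetical helpers; the PR hard-codes u-law, the A-law fallback is illustrative.
interface G711Choice {
  payloadType: number;                    // static RTP payload type (0 or 8)
  ffmpegCodec: "pcm_mulaw" | "pcm_alaw";  // matching FFmpeg encoder name
}

// Prefer u-law (payload type 0), fall back to A-law (payload type 8).
function pickG711(offeredPayloadTypes: number[]): G711Choice | undefined {
  if (offeredPayloadTypes.indexOf(0) !== -1) return { payloadType: 0, ffmpegCodec: "pcm_mulaw" };
  if (offeredPayloadTypes.indexOf(8) !== -1) return { payloadType: 8, ffmpegCodec: "pcm_alaw" };
  return undefined;
}

// FFmpeg output arguments for sending 8 kHz mono G.711 as RTP to the peer.
function rtpOutputArgs(choice: G711Choice, address: string, port: number): string[] {
  return [
    "-vn",
    "-acodec", choice.ffmpegCodec,
    "-ar", "8000",
    "-ac", "1",
    "-payload_type", String(choice.payloadType),
    "-f", "rtp",
    `rtp://${address}:${port}`,
  ];
}
```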

Add the following to your camera config:

    "sipConfig": {
        "to": "sip:[email protected]",
        "from": "sip:[email protected]"
    }

A big thank you goes to this project: https://github.com/dgreif/ring The Ring cameras speak SIP towards the Ring servers, so I learned all the basics from dgreif's solution.

The RTPhelper was inspired by this project: https://github.com/hjdhjd/homebridge-unifi-protect

The RTPhelper is necessary as my SIP doorbell requires symmetric RTP. As having two FFmpeg instances receiving and sending on the same UDP port does not work, I had to create an additional helper socket for this. Without using symmetric RTP, my doorbell immediately stopped sending RTP audio when it started to receive RTP audio from some random port from FFmpeg.
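The forwarding decision behind such a helper socket can be sketched as a pure function. This is an assumption-laden illustration rather than the PR's RTPhelper: the names `Endpoint`, `RelayConfig` and `routeDatagram` are hypothetical, and the sketch assumes both FFmpeg instances talk to the helper over loopback. In a real helper, this function would presumably be called from the `message` handler of a Node `dgram` socket, and the result passed to `socket.send`, so that all traffic to the doorbell leaves from the one port announced in the SDP.

```typescript
// Hypothetical types and names; not the PR's RTPhelper.
interface Endpoint {
  address: string;
  port: number;
}

interface RelayConfig {
  doorbell: Endpoint;       // remote RTP endpoint taken from the SDP answer
  ffmpegReceiver: Endpoint; // local port where the receiving FFmpeg listens
}

function isLoopback(address: string): boolean {
  return address === "127.0.0.1" || address === "::1";
}

// Decide where a datagram that arrived on the shared helper socket should go.
// - From the doorbell: hand it to the FFmpeg instance that decodes incoming audio.
// - From local FFmpeg: send it on to the doorbell, so it leaves from the helper's port.
// - From anywhere else: drop it (undefined).
// The doorbell is matched by address only, since its source port may differ
// from the port it advertised.
function routeDatagram(source: Endpoint, cfg: RelayConfig): Endpoint | undefined {
  if (source.address === cfg.doorbell.address) return cfg.ffmpegReceiver;
  if (isLoopback(source.address)) return cfg.doorbell;
  return undefined;
}
```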

nanosonde avatar Jan 20 '22 16:01 nanosonde

Oh, this is awesome. I’ll try to review this soon, but I expect I won’t merge it until I finish HKSV support (which shouldn’t be too long now).

Sunoo avatar Jan 20 '22 16:01 Sunoo

Hi @Sunoo!

Of course, please take your time. As this is my first TypeScript/Node.js project, please forgive me.

BTW: I have been testing it here for a couple of days and it works really reliably.

nanosonde avatar Jan 20 '22 16:01 nanosonde

If you want to try it out quickly without having a SIP doorbell, just use a simple SIP client, for example this one for Windows: https://lite.phoner.de/index_en.htm

If you activate the debug view (you find it under "Help"), you will see the SIP signalling. Just skip the SIP account wizard. The important things in the config are the two IP addresses in the TO and FROM fields of the SIP config. BTW: in FROM you can also specify a different port number after the IP address in case you have several SIP doorbells that each need their own SIP stack.
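To make the optional port in FROM concrete, here is a small sketch of how such a URI could be split into user, host, and port. `parseSipUri` is a hypothetical helper, not the plugin's parser; the addresses in the usage note are made up, IPv6 literals are not handled, and 5060 is the standard default SIP port.

```typescript
// Hypothetical helper; handles only sip:[user@]host[:port] with IPv4 or hostnames.
interface SipAddress {
  user?: string;
  host: string;
  port: number;
}

function parseSipUri(uri: string): SipAddress | undefined {
  const match = /^sip:(?:([^@]+)@)?([^:;]+)(?::(\d+))?/.exec(uri);
  if (!match) return undefined;
  const host = match[2];
  if (!host) return undefined;
  return {
    user: match[1],
    host,
    // Fall back to the default SIP port when none is given.
    port: match[3] ? Number(match[3]) : 5060,
  };
}
```

For example, `parseSipUri("sip:homebridge@192.168.1.2:5062")` yields port 5062, while `parseSipUri("sip:192.168.1.3")` falls back to 5060. Giving each doorbell a different FROM port would let each SIP stack listen on its own port.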

nanosonde avatar Jan 20 '22 16:01 nanosonde

What is not implemented yet: the solution only does a direct SIP call to the doorbell. It does not issue a SIP REGISTER and it does not support authentication. This would be required to connect the SIP RINGING message to the doorbell characteristic. However, as my doorbell also offers a HTTP webhook to be called on doorbell button press, this is enough for me.

Another interesting finding about my doorbell: it is registered to my local Fritzbox (a router with a small SIP PBX inside), and the SIP registration to another PBX works at the same time as a direct SIP call via IP. At least it does for my SIP doorbell.

nanosonde avatar Jan 20 '22 16:01 nanosonde

@nanosonde, this sounds like a breakthrough! I have been waiting a long time for support for my DoorBird doorbell. I'm eager to test, but I don't understand how to install this plugin while the pull request is still pending, so I'll be patient until Sunoo has time to validate and accept it. For my understanding: does this feature call OUT to the doorbell, or can the doorbell also place an inbound SIP call directly to Homebridge via this plugin?

mbrackjr avatar Feb 04 '22 19:02 mbrackjr

> @nanosonde, this sounds like a breakthrough! I have been waiting a long time for support for my DoorBird doorbell. I'm eager to test, but I don't understand how to install this plugin while the pull request is still pending, so I'll be patient until Sunoo has time to validate and accept it. For my understanding: does this feature call OUT to the doorbell, or can the doorbell also place an inbound SIP call directly to Homebridge via this plugin?

The current implementation calls OUT to the doorbell only. As a result, the doorbell has to offer another way to signal the button press. For example, my doorbell can also call an HTTP webhook. I use the HTTP webhook already provided by this plugin to send the iOS notification with a snapshot. If I then tap the notification, the HomeKit camera opens as usual. While the camera is opening, this SIP feature immediately calls OUT, and the doorbell picks up the call automatically. Thus, two-way audio is available right away. However, my doorbell only supports MJPEG video, so I do not know what will happen with other codecs, which might cause audio and video to drift out of sync.

nanosonde avatar Feb 04 '22 19:02 nanosonde

Ok, sounds great; being able to just tap the default iOS notification makes for a seamless experience. Now I'm even more eager to test, hopefully Sunoo has time soon ;-)

mbrackjr avatar Feb 04 '22 19:02 mbrackjr

It took me a few minutes to figure out how to deploy this into Homebridge while using the Config UI X solution. I just got my first successful call. The connection is to an Axis A8105-E door station. For those trying it: it uses UDP as the transport protocol. I'm trying to figure out why the SIP payload is consuming so much bandwidth. It uses more bandwidth on a peer-to-peer connection than my normal VoIP at the office, which uses Asterisk and the Opus codec, a wideband codec. Awesome work though. I'm curious whether there are plans to allow editing core SIP settings, such as a SIP.conf file for picking the SIP port, codecs, proxy, ICE, TURN, DTMF, and media encryption.

ryan99alero avatar Feb 14 '22 01:02 ryan99alero

Nanosonde, I'm curious about your tests. How much time did this add to your connection time? When connecting, I notice that SIP connects quickly, in under 2 seconds, but the video part takes longer. It currently takes 17 seconds to establish the full connection. When I was just using Sunoo's package without SIP, I had the video connection down to about 8 seconds. It may still be something on my end. I'm wondering whether the feed is pushing audio over both the H.264 feed and SIP, and whether that is what creates the extra bandwidth on my end.

ryan99alero avatar Feb 14 '22 01:02 ryan99alero

After more playing around with your build, I can get video without SIP, streaming with RTSP audio, in 2.5 seconds. Once I enable SIP, it takes 13 seconds. I downloaded the Linphone app, as it is the only simple peer-to-peer SIP client I know of. Simple is the key word. It connects to the Axis door station's SIP account and has full-duplex audio in under 2 seconds. I guess I could try my current DoorBird door station, which the Axis will be replacing, and see how it performs with SIP and video.

ryan99alero avatar Feb 15 '22 00:02 ryan99alero

Thanks so much for your efforts. I'm just trying it out with my 2N Intercom which supports SIP

I set up my config as

            "cameras": [
                {
                    "name": "Intercom test",
                    ...
                    "sipConfig": {
                        "to": "sip:192.168.1.3", // ip of my intercom
                        "from": "sip:192.168.1.2" // ip of my homebridge server
                    }
                }
            ],

But I'm seeing a SIP INVITE error when I open the camera

[11/09/2022, 11:19:18 am] [Camera FFmpeg] sip INVITE request failed with status 481
[11/09/2022, 11:19:18 am] [Camera FFmpeg] [Intercom test] SIP INVITE failed: Error: sip INVITE request failed with status 481
[11/09/2022, 11:19:18 am] [Camera FFmpeg] [Intercom test] Starting video stream: 1280 x 720, 30 fps, 299 kbps (AAC-eld)
[11/09/2022, 11:19:18 am] [Camera FFmpeg] [Intercom test] FFmpeg exited with code: 1 and signal: null (Error)
[11/09/2022, 11:19:18 am] [Camera FFmpeg] [Intercom test] Stopped video stream.
[11/09/2022, 11:19:18 am] [Camera FFmpeg] sip BYE request failed with status 481

When I test a SIP direct call using MicroSIP to my intercom, it works and I see this in the MicroSIP logs

INVITE sip:192.168.1.3 SIP/2.0
Via: SIP/2.0/UDP 172.29.32.1:60420;rport;branch=z9hG4bKPjbf4cb12c764e4dc09fba4b4031e1f0cf
Max-Forwards: 70
From: <sip:172.29.32.1>;tag=ce897c027d3448a1b14b2b431649f2d8
To: <sip:192.168.1.3>
Contact: <sip:192.168.1.33:60420;ob>
Call-ID: 524ec458ee4c42fdb084033f2cfe7bdc
CSeq: 16740 INVITE
Allow: PRACK, INVITE, ACK, BYE, CANCEL, UPDATE, INFO, SUBSCRIBE, NOTIFY, REFER, MESSAGE, OPTIONS
Supported: replaces, 100rel, timer, norefersub
Session-Expires: 1800
Min-SE: 90
User-Agent: MicroSIP/3.21.2
Content-Type: application/sdp
Content-Length:   799

v=0
o=- 3871884119 3871884119 IN IP4 192.168.1.33
s=pjmedia
b=AS:2184
t=0 0
a=X-nat:0
m=audio 4000 RTP/AVP 9 8 0 101
c=IN IP4 192.168.1.33
b=TIAS:64000
a=rtcp:4001 IN IP4 192.168.1.33
a=sendrecv
a=rtpmap:9 G722/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=ssrc:1786122839 cname:5bf320fe605815c9
m=video 4002 RTP/AVP 99 98 100 101
c=IN IP4 192.168.1.33
b=TIAS:2000000
a=rtcp:4003 IN IP4 192.168.1.33
a=sendrecv
a=rtpmap:99 H264/90000
a=fmtp:99 profile-level-id=42e01e; packetization-mode=1
a=rtpmap:98 H263-1998/90000
a=fmtp:98 CIF=1;QCIF=1
a=rtpmap:100 VP8/90000
a=fmtp:100 max-fr=30; max-fs=580
a=rtpmap:101 VP9/90000
a=fmtp:101 max-fr=30; max-fs=580
a=ssrc:463995847 cname:5bf320fe605815c9
a=rtcp-fb:* nack pli
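
An SDP body like the one above can be summarised per media section to see which codecs a client offers, for instance to confirm that PCMU is among them. The following is a minimal sketch with hypothetical names (`MediaSection`, `parseMediaSections`, `offersPcmu`); it only reads `m=` and `a=rtpmap` lines and is not a complete SDP parser.

```typescript
// Hypothetical helpers; only m= and a=rtpmap lines are interpreted.
interface MediaSection {
  kind: string;                   // "audio", "video", ...
  port: number;
  codecs: Record<number, string>; // payload type -> encoding name
}

function parseMediaSections(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const media = /^m=(\w+) (\d+) /.exec(line);
    if (media) {
      sections.push({ kind: media[1], port: Number(media[2]), codecs: {} });
      continue;
    }
    // rtpmap lines belong to the most recent m= section.
    const rtpmap = /^a=rtpmap:(\d+) ([^/\s]+)\//.exec(line);
    if (rtpmap && sections.length > 0) {
      sections[sections.length - 1].codecs[Number(rtpmap[1])] = rtpmap[2];
    }
  }
  return sections;
}

// True if any audio section explicitly maps payload type 0 to PCMU.
// Static payload types may also appear without an rtpmap line; this
// sketch only sees the explicit ones.
function offersPcmu(sections: MediaSection[]): boolean {
  for (const section of sections) {
    if (section.kind === "audio" && section.codecs[0] === "PCMU") return true;
  }
  return false;
}
```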

longzheng avatar Sep 11 '22 01:09 longzheng

Is there someone who could explain how I can install this plugin while using the config-ui-x solution ? Would love to test it…

KoljaV avatar Oct 24 '22 07:10 KoljaV

so if I understand correctly, is there support for SIP calls? @nanosonde, please.. is there a way to test your fork?

mrMiimo avatar Oct 29 '22 16:10 mrMiimo

@nanosonde: Any idea how I can test this pull request inside Homebridge running on an RPi while it's awaiting acceptance by @Sunoo?

mbrackjr avatar Nov 05 '22 11:11 mbrackjr

@KoljaV In the past I had to use the CLI. Clone the repo using the branch option, recursively if needed so you get any dependencies. Then I'd run the npm update, audit fix, outdated, and install commands, then run npm pack, which converts the build you just created into a .tgz file. Finally, use the npm command line to install that package, which will then show up in the Config UI admin portal. There may be a few more steps than what I've outlined below. These are just a few commands I saved as a reminder of how to get started deploying a plugin that is not listed in the options or that comes from a different branch.

    git clone -b sipdoorbell --single-branch https://github.com/nanosonde/homebridge-camera-ffmpeg.git
    cd homebridge-camera-ffmpeg
    npm-check-updates
    npm update
    npm audit fix
    npm outdated -g --depth=1
    npm install -g npm@latest
    npm install
    npm audit fix
    npm pack
    sudo npm install -g homebridge-camera-ffmpeg-3.1.4.tgz

ryan99alero avatar Nov 07 '22 19:11 ryan99alero

@KoljaV In this thread there are several patches that others have made that aren't part of the sipdoorbell branch. You may want to fork this build on GitHub and apply whichever of those fixes you want.

ryan99alero avatar Nov 07 '22 19:11 ryan99alero

If you guys can tell me for sure this works, I don’t have a strong problem with merging it somewhat blindly. That should make life easier for the folks who want to take advantage of it.

Sunoo avatar Nov 07 '22 19:11 Sunoo

> @KoljaV In the past I had to use the CLI. Clone the repo using the branch option, recursively if needed so you get any dependencies. Then I'd run the npm update, audit fix, outdated, and install commands, then run npm pack, which converts the build you just created into a .tgz file. Finally, use the npm command line to install that package, which will then show up in the Config UI admin portal. There may be a few more steps than what I've outlined below. These are just a few commands I saved as a reminder of how to get started deploying a plugin that is not listed in the options or that comes from a different branch.

I personally use npm link (https://docs.npmjs.com/cli/v8/commands/npm-link/), which is easy to update and test with, since I just run a build and don't need to pack and install again.

longzheng avatar Nov 07 '22 21:11 longzheng

> @KoljaV In this thread there are several patches that others have made that aren't part of the sipdoorbell branch. You may want to fork this build on GitHub and apply whichever of those fixes you want.

Yeah I was hoping the original submitter @nanosonde would accept the suggestions into his branch so it can be merged altogether. Otherwise maybe I can look at forking his PR, applying the changes and making a new PR as well.

> If you guys can tell me for sure this works, I don’t have a strong problem with merging it somewhat blindly. That should make life easier for the folks who want to take advantage of it.

Based on my testing and suggestions, I think there are a few issues that might also affect others if the fixes are not applied.

longzheng avatar Nov 07 '22 21:11 longzheng

@longzheng If you’re able to correct them in a new PR, I’ll merge that.

Sunoo avatar Nov 07 '22 22:11 Sunoo

@longzheng I recently redeployed my Homebridge VM. If you decide to put everything into a PR, let me know the branch and PR ID, and I'll redeploy and see if it works for me. I have an Axis A8105-E that I just ordered to replace my DoorBird for better quality. I also have an Axis A8207-VE that I got for work, but I'd imagine they both use the same SIP methodology. After the upgrade, I had planned to take the DoorBird apart and install a higher-quality CMOS sensor. Otherwise, I was just going to add your edits to a fork I made from @nanosonde's branch.

ryan99alero avatar Nov 07 '22 22:11 ryan99alero

> @longzheng If you’re able to correct them in a new PR, I’ll merge that.

> @longzheng I recently redeployed my Homebridge VM. If you decide to put everything into a PR, let me know the branch and PR ID, and I'll redeploy and see if it works for me.

Yep I'll do that after work today.

longzheng avatar Nov 08 '22 00:11 longzheng

@Sunoo @ryan99alero I've opened the new PR https://github.com/Sunoo/homebridge-camera-ffmpeg/pull/1355

longzheng avatar Nov 08 '22 23:11 longzheng

Fantastic, closing in favor of #1355

Sunoo avatar Nov 08 '22 23:11 Sunoo

> Yeah I was hoping the original submitter @nanosonde would accept the suggestions into his branch so it can be merged altogether. Otherwise maybe I can look at forking his PR, applying the changes and making a new PR as well.

Thanks for working on this. Personally I have no plans to work on this particular PR for this plugin anymore. Instead I have been working on something new.

I familiarized myself with GStreamer and was able to get a complete HomeKit pipeline running with it instead of FFmpeg, mainly because it supports RTP/SAVPF out of the box (for example, an RTCP PLI would trigger a new key frame in the H264 encoder), something that is not possible with FFmpeg. I also got Opus working after carefully reading the HomeKit spec for its deviations from the RFCs. It is only a PoC in Python, but it could probably be ported directly to JS, as done here.

Then I came across this comment. I have started to look into the code and observed that @koush has already solved so many things (RTCP feedback, Opus, etc.) with Scrypted, and the system is much more flexible with respect to its plugin architecture. So I decided to drop all my work so far, and I am currently analysing the options for getting my SIP video doorbell working with Scrypted: either by writing my own Scrypted plugin or by providing some external solution that bridges between SIP and something that one of the Scrypted plugins already supports.

After koush's comment that I might have hit a feature gap (*1) in Scrypted, I am currently evaluating a simple SIP<>ONVIF bridge based on my previous GStreamer experience. GStreamer already supports the ONVIF Profile T backchannel via RTSP, and my Python experiments showed that it really works. However, as the Scrypted ONVIF plugin requires a real ONVIF device to query things like "GetProfiles()", I am also searching for a simple ONVIF mock that makes the plugin believe it is talking to a real ONVIF device. Another option that I have been experimenting with is an emulated UniFi Protect server based on FastAPI. Camera streaming would work with RTSP, and the audio speaker channel would be realised via WebSocket.

(*1) The problem with all plugins is that, because video (RTSP/HTTP) is completely separate from audio (SIP+RTP), the RTSP stream would have to be reconfigured once the audio connection to the camera is established. However, to my knowledge, reconfiguring RTSP streams at runtime is specified in the RFC but not very well supported. A possible solution would be an always-present audio stream that just contains silence while the SIP audio is not connected. This could also be realized quite easily with the GStreamer audiomixer element.

Side note: if the video stream had also been negotiated via SIP, it would have been easier to directly create a SIP plugin for Scrypted based on an older version of the Ring plugin, which used SIP to get video and audio streams from the cloud.
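
The "always-present audio stream" idea can be illustrated independently of GStreamer. The sketch below is not an audiomixer pipeline; it is a plain TypeScript illustration that assumes G.711 u-law at 8000 Hz with 20 ms frames, where 0xFF encodes zero amplitude. The names `silenceFrame` and `nextAudioPayload` are hypothetical.

```typescript
// Hypothetical sketch: keep an audio stream alive by substituting silence
// whenever no SIP audio is available. Assumes G.711 u-law, 8000 Hz, 20 ms frames.
const PCMU_SILENCE_BYTE = 0xff; // u-law encoding of zero amplitude
const SAMPLES_PER_FRAME = 160;  // 20 ms at 8000 Hz, one byte per sample

function silenceFrame(): Uint8Array {
  const frame = new Uint8Array(SAMPLES_PER_FRAME);
  for (let i = 0; i < frame.length; i++) frame[i] = PCMU_SILENCE_BYTE;
  return frame;
}

// Return the SIP payload if one is available, otherwise a frame of silence,
// so the downstream stream never has to be reconfigured.
function nextAudioPayload(sipPayload?: Uint8Array): Uint8Array {
  return sipPayload && sipPayload.length > 0 ? sipPayload : silenceFrame();
}
```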

Too many possibilities....

nanosonde avatar Nov 14 '22 08:11 nanosonde

@nanosonde excellent reasoning ... Scrypted is a solid foundation on which to build anything. If I understand your thinking, you have direct SIP-to-SIP calls in mind (no proxy or registrar involved), whereas many intercoms require access to a SIP server (https://www.linphone.org/sites/default/files/solutions-intercomsystems.pdf) ...

I don't know if it helps, maybe even with video handling, but lately I have been using Baresip (https://github.com/baresip/baresip). It is very modular, it allows all types of calls (even direct SIP-to-SIP), it handles a large number of audio and video codecs, and it can also be integrated with MQTT (useful for managing the call). You may already know it, but Baresip is like a command-line Linphone ...

I managed to integrate my intercom into HomeKit using an ALSA audio loopback, homebridge-ffmpeg, and Baresip ... the audio quality is fantastic, but video activation is very slow. In fact, I only use the audio part, and for the video part I add an encrypted RTSP video stream.

mrMiimo avatar Nov 14 '22 10:11 mrMiimo