Unifi 2-way audio
I'm trying to get 2-way audio working on my Unifi doorbell. While I can hear my voice on the stream, it never comes out of the doorbell's speaker. Did anybody manage to get 2-way audio working for Unifi?
I have never heard that Unifi uses open standards for two-way audio.
Have a look here: https://github.com/hjdhjd/homebridge-unifi-protect/blob/c20ec04dd995dd989c5754955624c2979497555a/src/protect-stream.ts#L724
Basically it seems that they send AAC codec data in ADTS format via a websocket connection to play audio on the camera speaker.
https://wiki.multimedia.cx/index.php/ADTS
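For reference, ADTS framing is just a 7-byte header prepended to each raw AAC frame, so a sender would wrap every frame before pushing it down the websocket. A minimal Python sketch of the header layout described on that wiki page (the 44.1 kHz sampling-frequency index and AAC-LC profile are illustrative defaults, not anything Unifi-specific):

```python
def adts_header(frame_len: int, sample_rate_idx: int = 4,
                channels: int = 1, profile: int = 2) -> bytes:
    """Build a 7-byte ADTS header (no CRC) for one raw AAC frame.

    frame_len       -- length of the raw AAC frame in bytes
    sample_rate_idx -- index into the ADTS frequency table (4 = 44100 Hz)
    channels        -- channel configuration (1 = mono)
    profile         -- MPEG-4 audio object type (2 = AAC LC)
    """
    full_len = frame_len + 7  # the length field counts the header itself
    hdr = bytearray(7)
    hdr[0] = 0xFF             # syncword high bits (0xFFF)
    hdr[1] = 0xF1             # syncword low bits, MPEG-4, layer 0, no CRC
    hdr[2] = ((profile - 1) << 6) | (sample_rate_idx << 2) | (channels >> 2)
    hdr[3] = ((channels & 3) << 6) | (full_len >> 11)
    hdr[4] = (full_len >> 3) & 0xFF
    hdr[5] = ((full_len & 7) << 5) | 0x1F  # buffer fullness = 0x7FF (VBR)
    hdr[6] = 0xFC             # buffer fullness cont., 1 AAC frame per packet
    return bytes(hdr)

# A sender would do: ws.send(adts_header(len(frame)) + frame) for each frame.
```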
AAC/ADTS is a problem. Browsers don't support capturing this codec from the microphone. Maybe the codec can be changed?
@victor-perez, did you, by chance, find a way to make this work?
As go2rtc supports FFmpeg as a source, I would expect it's possible to make this work once the expected codec is known (which it is, based on your link) and then transcode as needed.
But I'm afraid my experience with FFmpeg is too little to implement this myself.
AAC/ADTS is a problem. Browsers don't support capturing this codec from the microphone. Maybe the codec can be changed?
Just throwing this out there, but maybe wasm (ffmpeg) and/or WebCodecs API could be viable options? It looks like WebCodecs API is available in recent Chrome/Edge builds.
Interested in this as well - any config examples that work for anyone would be appreciated. I can't get anything to work.
Could I help with this feature request somehow? I tried installing Scrypted and it is able to send two-way audio from Chrome to a Unifi camera. I don't know whether it's doing server-side transcoding or not.
I've been building out some nice HA dashboards with go2rtc and it would be awesome to have two way audio to my Unifi cameras via go2rtc.
Also very interested in seeing this one. There seems to be no good solution for Unifi 2-way audio integration in Home Assistant at the moment.
+1
I've tried several things since February and am now more confident that there is no good solution for Unifi 2-way audio.
While I still use Frigate as my primary NVR, I have discovered that Scrypted has a plugin for Unifi cameras that allows consistent 2-way communication over HTTPS. You have to use their card and pay for the basic NVR service to unlock the API token, but it works consistently.
https://docs.scrypted.app/home-assistant.html
I'm using Scrypted, but the lag on some of the cameras is too long for me. Did anyone sort this out in the meantime?
Does anyone have new info on this? I get 2-way audio when I expose my Unifi Protect doorbell to Apple HomeKit via Homebridge. No problems there. But I can't figure it out with go2rtc. Can anyone help, please?
Did you look at Scrypted?
Yes, it works there too. I'm trying to figure out how to add that functionality to Home Assistant. Any ideas?
The only way I found was using the Scrypted iframes.
Is this only for the paid Scrypted NVR version?
This is correct.
If only go2rtc would provide a working stream for 2-way audio. I want to use an open-source and free method. Can you think of any other way to use the G4 Doorbell Pro in Home Assistant with 2-way audio?
I mean, it is possible. HomeKit and Scrypted work fine...
Looks like Homebridge uses ffmpeg to transcode the stream and passes it to the websocket.
https://github.com/hjdhjd/homebridge-unifi-protect/blob/87fc75de4671ded6c0f5a7eb194c5721b8f35818/src/protect-stream.ts#L797
It stands to reason the same approach could be used. Some of the magic values could probably be copied from there.
Hello,
Did anyone figure this out?
BR, Jens
Very interested in this as well. I wish there were an easy way to integrate this with HA without paying for Scrypted. These cameras already come with the Unifi Protect NVR; I don't want to pay for another NVR!
I'm interested in this for the standalone docker image. I'll dig into the code a bit when I have time. Subscribed for now.
FYI, it's now possible to start a talkback session through the official Unifi Protect API:
curl --http1.1 -v -k -X POST 'https://192.168.1.1/proxy/protect/integration/v1/cameras/681e457f03b8dc03e4256d97/talkback-session' -H "X-API-KEY: $UNIFI_API_KEY" -H 'Accept: application/json'
> POST /proxy/protect/integration/v1/cameras/681e457f03b8dc03e4256d97/talkback-session HTTP/1.1
> Host: 192.168.1.1
> User-Agent: curl/8.10.1
> X-API-KEY: <redacted>
> Accept: application/json
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Thu, 15 May 2025 11:54:00 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 88
< Connection: keep-alive
<
{
"url": "rtp://192.168.1.56:7004",
"codec": "opus",
"samplingRate": 24000,
"bitsPerSample": 16
}
Seems to be working. Looks like it always returns the same URL/params (camera IP address/port and parameters). Maybe it's not necessary to call that endpoint every time you need two-way audio.
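For anyone scripting this: a small sketch that turns the JSON response above into an ffmpeg argument list. The `libopus` mapping and the argument ordering are my assumptions based on the ffmpeg invocations in this thread, not anything documented by the Protect API:

```python
def talkback_ffmpeg_args(session: dict, audio_path: str) -> list[str]:
    """Build ffmpeg arguments from a talkback-session response like
    {"url": "rtp://...", "codec": "opus", "samplingRate": 24000, ...}."""
    # ffmpeg's Opus encoder is named libopus, not opus
    codec = {"opus": "libopus"}.get(session["codec"], session["codec"])
    return [
        "ffmpeg", "-re", "-i", audio_path,  # read the source file in real time
        "-vn",                              # audio only
        "-acodec", codec,
        "-ar", str(session["samplingRate"]),
        "-sample_fmt", "s16",
        "-f", "rtp", session["url"],        # send to the camera's RTP endpoint
    ]
```

In practice the `session` dict would come from the POST shown in the curl example above, and the resulting list could be passed straight to `subprocess.run`.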
🎉
Seems like that URL works at any time (without needing to call the API endpoint first), and I can actually send audio at any moment, and it actually plays:
$ ffmpeg -re -i sample-9s.wav -ar 24000 -sample_fmt s16 -vn -acodec libopus -f rtp rtp://192.168.1.56:7004
ffmpeg version n7.0.2 Copyright (c) 2000-2024 the FFmpeg developers
built with gcc 14.2.1 (GCC) 20240910
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-frei0r --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libdav1d --enable-libdrm --enable-libdvdnav --enable-libdvdread --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libharfbuzz --enable-libiec61883 --enable-libjack --enable-libjxl --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libplacebo --enable-libpulse --enable-librav1e --enable-librsvg --enable-librubberband --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpl --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-nvdec --enable-nvenc --enable-opencl --enable-opengl --enable-shared --enable-vapoursynth --enable-version3 --enable-vulkan
libavutil 59. 8.100 / 59. 8.100
libavcodec 61. 3.100 / 61. 3.100
libavformat 61. 1.100 / 61. 1.100
libavdevice 61. 1.100 / 61. 1.100
libavfilter 10. 1.100 / 10. 1.100
libswscale 8. 1.100 / 8. 1.100
libswresample 5. 1.100 / 5. 1.100
libpostproc 58. 1.100 / 58. 1.100
[aist#0:0/pcm_s16le @ 0x623e538dcb00] Guessed Channel Layout: stereo
Input #0, wav, from 'sample-9s.wav':
Duration: 00:00:09.59, bitrate: 1411 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> opus (libopus))
Press [q] to stop, [?] for help
[libopus @ 0x623e538aae40] No bit rate set. Defaulting to 96000 bps.
Output #0, rtp, to 'rtp://192.168.1.56:7004':
Metadata:
encoder : Lavf61.1.100
Stream #0:0: Audio: opus, 24000 Hz, stereo, s16, 96 kb/s
Metadata:
encoder : Lavc61.3.100 libopus
SDP:
v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 192.168.1.56
t=0 0
a=tool:libavformat 61.1.100
m=audio 7004 RTP/AVP 97
b=AS:96
a=rtpmap:97 opus/48000/2
a=fmtp:97 sprop-stereo=1
[out#0/rtp @ 0x623e538dcc80] video:0KiB audio:119KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 4.756492%
size= 125KiB time=00:00:09.58 bitrate= 106.9kbits/s speed=1.06x
It's funny that there is no authentication/authorization at all.
Basically anyone with network access to the camera/doorbell can play any sounds/music at any time.
Well, I was wrong:
- Just tried with another camera.
- You need to call that endpoint at least once for audio to be played through RTP.
- Without calling the endpoint you can still send audio, but it won't be played.
- After calling the endpoint you can stream audio at any moment (it doesn't seem to expire, even after 15 minutes).
- Audio plays with almost zero delay.
I have a feeling that two-way audio can be implemented with just a super-tricky ffmpeg_source/exec config: we would need to somehow combine the ffmpeg RTSP input (video/audio) stream and the RTP audio output stream.
I wonder if Stream #0:1: Audio: opus, 48000 Hz, stereo, fltp is actually a talk-back channel 🤔
Somehow made it work with exec:
exec:ffmpeg -re -i rtsps://192.168.1.1:7441/aSrjYl35MXKCsF22 -i pipe: -map 0:0 -map 0:1 -map 0:2 -c copy -rtsp_transport tcp -f rtsp {output} -map 1:0 -ar 24000 -vn -acodec libopus -ar 24000 -sample_fmt s16 -b:a 32k -f rtp rtp://192.168.2.125:7004#backchannel=1
Summary:
- Reads the camera RTSP stream as Stream 0
- Reads audio from stdin as Stream 1
- From Stream 0, writes output to {output} as RTSP
- From Stream 1, writes to the camera's RTP audio stream, transcoding to the Opus audio codec
Obviously it doesn't work very well on Raspberry Pi hardware 😒
After some testing, it seems the following config works much better (go2rtc routes the streams automatically):
streams:
  test:
    - rtspx://192.168.1.1:7441/aSrjYl35MXKCsF22#backchannel=0
    - "exec:ffmpeg -re -fflags nobuffer -f s16be -ar 8000 -i - -vn -acodec libopus -ar 24000 -sample_fmt s16 -b:a 32k -f rtp rtp://192.168.2.125:7004#backchannel=1"