Unifi 2-way audio
I'm trying to get 2-way audio working on my Unifi doorbell. While I can hear my voice on the stream, it never comes out of the doorbell's speaker. Did anybody manage to get 2-way audio working for Unifi?
I have never heard that Unifi uses open standards for two-way audio.
Have a look here: https://github.com/hjdhjd/homebridge-unifi-protect/blob/c20ec04dd995dd989c5754955624c2979497555a/src/protect-stream.ts#L724
Basically it seems that they send AAC codec data in ADTS format via a websocket connection to play audio on the camera speaker.
https://wiki.multimedia.cx/index.php/ADTS
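For reference, ADTS framing is just a 7-byte header prepended to each raw AAC frame, so a sender would wrap every frame before pushing it down the websocket. A minimal Python sketch of the header layout described on that wiki page (the 44.1 kHz sampling-frequency index and AAC-LC profile are illustrative defaults, not anything Unifi-specific):

```python
def adts_header(frame_len: int, sample_rate_idx: int = 4,
                channels: int = 1, profile: int = 2) -> bytes:
    """Build a 7-byte ADTS header (no CRC) for one raw AAC frame.

    frame_len       -- length of the raw AAC frame in bytes
    sample_rate_idx -- index into the ADTS frequency table (4 = 44100 Hz)
    channels        -- channel configuration (1 = mono)
    profile         -- MPEG-4 audio object type (2 = AAC LC)
    """
    full_len = frame_len + 7  # the length field counts the header itself
    hdr = bytearray(7)
    hdr[0] = 0xFF             # syncword high bits (0xFFF)
    hdr[1] = 0xF1             # syncword low bits, MPEG-4, layer 0, no CRC
    hdr[2] = ((profile - 1) << 6) | (sample_rate_idx << 2) | (channels >> 2)
    hdr[3] = ((channels & 3) << 6) | (full_len >> 11)
    hdr[4] = (full_len >> 3) & 0xFF
    hdr[5] = ((full_len & 7) << 5) | 0x1F  # buffer fullness = 0x7FF (VBR)
    hdr[6] = 0xFC             # buffer fullness cont., 1 AAC frame per packet
    return bytes(hdr)

# A sender would do: ws.send(adts_header(len(frame)) + frame) for each frame.
```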
AAC/ADTS is a problem. Browsers don't support capturing this codec from the microphone. Maybe the codec can be changed?
@victor-perez, did you, by chance, find a way to make this work?
As go2rtc supports FFmpeg as a source, I would expect it's possible to make this work once the expected codec is known (which it is, based on your link) and then transcode as needed.
But I'm afraid my experience with FFmpeg is too little to implement this myself.
AAC/ADTS is a problem. Browsers don't support capturing this codec from the microphone. Maybe the codec can be changed?
Just throwing this out there, but maybe wasm (ffmpeg) and/or WebCodecs API could be viable options? It looks like WebCodecs API is available in recent Chrome/Edge builds.
Interested in this as well - any config examples that work for anyone would be appreciated. I can't get anything to work.
Could I help with this feature request somehow? I tried installing Scrypted and it is able to send two-way audio from Chrome to a Unifi camera. I don't know whether it's doing server-side transcoding or not.
I've been building out some nice HA dashboards with go2rtc and it would be awesome to have two way audio to my Unifi cameras via go2rtc.
Also very interested in seeing this one. There seems to be no good solution for Unifi 2-way audio integration in Home Assistant at the moment.
+1
I've tried several things since February and am now more confident that there is no good solution for Unifi 2-way audio.
While I still use Frigate as my primary NVR, I have discovered that Scrypted has a plugin for Unifi cameras that allows consistent 2-way communication over HTTPS. You have to use their card and pay for the basic NVR service to unlock the API token, but it works consistently.
https://docs.scrypted.app/home-assistant.html
I'm using Scrypted, but the lag on some of the cameras is too long for me. Did anyone sort this out in the meantime?
Does anyone have new info on this? I get 2-way audio when I expose my Unifi Protect doorbell to Apple HomeKit via Homebridge. No problems there. But I can't figure it out with go2rtc. Can anyone help, please?
Did you look at Scrypted?
Yes, it works there too. I'm trying to figure out how to add that functionality to Home Assistant. Any ideas?
The only way I found was using the Scrypted iframes.
Is this only for the paid Scrypted NVR version?
This is correct.
If only go2rtc would provide a working stream for 2-way audio. I want to use an open-source and free method. Can you think of any other way to use the G4 Doorbell Pro in Home Assistant with 2-way audio?
I mean, it is possible. HomeKit and Scrypted work fine...
Looks like Homebridge uses ffmpeg to transcode the stream and passes it to the websocket.
https://github.com/hjdhjd/homebridge-unifi-protect/blob/87fc75de4671ded6c0f5a7eb194c5721b8f35818/src/protect-stream.ts#L797
It stands to reason the same approach could be used. Some of the magic values could probably be copied from there.
Hello,
Did anyone figure this out?
BR, Jens
Very interested in this as well. I wish there were an easy way to integrate this with HA without paying for Scrypted. These cameras already come with the Unifi Protect NVR; I don't want to pay for another NVR!
I'm interested in this for the standalone docker image. I'll dig into the code a bit when I have time. Subscribed for now.
FYI, it's now possible to start a talkback session through the official Unifi Protect API:
curl --http1.1 -v -k -X POST 'https://192.168.1.1/proxy/protect/integration/v1/cameras/681e457f03b8dc03e4256d97/talkback-session' -H "X-API-KEY: $UNIFI_API_KEY" -H 'Accept: application/json'
> POST /proxy/protect/integration/v1/cameras/681e457f03b8dc03e4256d97/talkback-session HTTP/1.1
> Host: 192.168.1.1
> User-Agent: curl/8.10.1
> X-API-KEY: <redacted>
> Accept: application/json
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Thu, 15 May 2025 11:54:00 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 88
< Connection: keep-alive
<
{
"url": "rtp://192.168.1.56:7004",
"codec": "opus",
"samplingRate": 24000,
"bitsPerSample": 16
}
Seems to be working. Looks like it always returns the same URL/params (camera IP address/port and parameters). Maybe it's not necessary to call that endpoint every time you need two-way audio.
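For anyone scripting this: a small sketch that turns the JSON response above into an ffmpeg argument list. The `libopus` mapping and the argument ordering are my assumptions based on the ffmpeg invocations in this thread, not anything documented by the Protect API:

```python
def talkback_ffmpeg_args(session: dict, audio_path: str) -> list[str]:
    """Build ffmpeg arguments from a talkback-session response like
    {"url": "rtp://...", "codec": "opus", "samplingRate": 24000, ...}."""
    # ffmpeg's Opus encoder is named libopus, not opus
    codec = {"opus": "libopus"}.get(session["codec"], session["codec"])
    return [
        "ffmpeg", "-re", "-i", audio_path,  # read the source file in real time
        "-vn",                              # audio only
        "-acodec", codec,
        "-ar", str(session["samplingRate"]),
        "-sample_fmt", "s16",
        "-f", "rtp", session["url"],        # send to the camera's RTP endpoint
    ]
```

In practice the `session` dict would come from the POST shown in the curl example above, and the resulting list could be passed straight to `subprocess.run`.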
🎉
Seems like that URL works at any time (without needing to call the API endpoint first), and I can actually send audio at any moment, and it actually plays:
$ ffmpeg -re -i sample-9s.wav -ar 24000 -sample_fmt s16 -vn -acodec libopus -f rtp rtp://192.168.1.56:7004
ffmpeg version n7.0.2 Copyright (c) 2000-2024 the FFmpeg developers
built with gcc 14.2.1 (GCC) 20240910
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-frei0r --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libdav1d --enable-libdrm --enable-libdvdnav --enable-libdvdread --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libharfbuzz --enable-libiec61883 --enable-libjack --enable-libjxl --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libplacebo --enable-libpulse --enable-librav1e --enable-librsvg --enable-librubberband --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpl --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-nvdec --enable-nvenc --enable-opencl --enable-opengl --enable-shared --enable-vapoursynth --enable-version3 --enable-vulkan
libavutil 59. 8.100 / 59. 8.100
libavcodec 61. 3.100 / 61. 3.100
libavformat 61. 1.100 / 61. 1.100
libavdevice 61. 1.100 / 61. 1.100
libavfilter 10. 1.100 / 10. 1.100
libswscale 8. 1.100 / 8. 1.100
libswresample 5. 1.100 / 5. 1.100
libpostproc 58. 1.100 / 58. 1.100
[aist#0:0/pcm_s16le @ 0x623e538dcb00] Guessed Channel Layout: stereo
Input #0, wav, from 'sample-9s.wav':
Duration: 00:00:09.59, bitrate: 1411 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> opus (libopus))
Press [q] to stop, [?] for help
[libopus @ 0x623e538aae40] No bit rate set. Defaulting to 96000 bps.
Output #0, rtp, to 'rtp://192.168.1.56:7004':
Metadata:
encoder : Lavf61.1.100
Stream #0:0: Audio: opus, 24000 Hz, stereo, s16, 96 kb/s
Metadata:
encoder : Lavc61.3.100 libopus
SDP:
v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 192.168.1.56
t=0 0
a=tool:libavformat 61.1.100
m=audio 7004 RTP/AVP 97
b=AS:96
a=rtpmap:97 opus/48000/2
a=fmtp:97 sprop-stereo=1
[out#0/rtp @ 0x623e538dcc80] video:0KiB audio:119KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 4.756492%
size= 125KiB time=00:00:09.58 bitrate= 106.9kbits/s speed=1.06x
It's funny that there is no authentication/authorization at all.
Basically anyone with network access to the camera/doorbell can play any sounds/music at any time.
Well, I was wrong:
- Just tried with another camera.
- You need to call that endpoint at least once for audio to be played through RTP.
- Without calling the endpoint you can still send audio, but it won't be played.
- After calling the endpoint you can stream audio at any moment (it doesn't seem to expire, even after 15 minutes).
- Audio plays with almost zero delay.
I have a feeling that two-way audio can be implemented with just a super-tricky ffmpeg_source/exec config: we would need to somehow combine the ffmpeg RTSP input (video/audio) stream and the RTP audio output stream.
I wonder if Stream #0:1: Audio: opus, 48000 Hz, stereo, fltp is actually a talk-back channel 🤔
Somehow made it work with exec:
exec:ffmpeg -re -i rtsps://192.168.1.1:7441/aSrjYl35MXKCsF22 -i pipe: -map 0:0 -map 0:1 -map 0:2 -c copy -rtsp_transport tcp -f rtsp {output} -map 1:0 -ar 24000 -vn -acodec libopus -ar 24000 -sample_fmt s16 -b:a 32k -f rtp rtp://192.168.2.125:7004#backchannel=1
Summary:
- Reads the camera RTSP stream as Stream 0
- Reads audio from stdin as Stream 1
- From Stream 0, writes output to {output} as RTSP
- From Stream 1, writes to the camera's RTP audio stream, transcoding to the Opus audio codec
Obviously it doesn't work very well on Raspberry Pi hardware 😒
After some testing, it seems the following config works much better (go2rtc routes the streams automatically):
streams:
  test:
    - rtspx://192.168.1.1:7441/aSrjYl35MXKCsF22#backchannel=0
    - "exec:ffmpeg -re -fflags nobuffer -f s16be -ar 8000 -i - -vn -acodec libopus -ar 24000 -sample_fmt s16 -b:a 32k -f rtp rtp://192.168.2.125:7004#backchannel=1"