
Feature Request: Push-to-talk

Open sophiedankel opened this issue 6 years ago • 19 comments

Several models of Arlo cameras support push-to-talk: https://kb.arlo.com/1004319/What-is-the-push-to-talk-feature-on-my-Arlo-camera-and-how-does-it-work

I would like to extend this python library to be able to send audio to my camera and have it play from the camera's speakers.

sophiedankel avatar Mar 16 '18 22:03 sophiedankel

Hi @sophiedankel thanks for opening this issue. What kind of Arlo cameras do you have that have this feature?

In order to get this working, we need to figure out what HTTP requests your browser makes when you use this feature from the web UI. To do that, you can open the Chrome devtools (Chrome is best for this task). Click on the Network tab, check "Preserve log", and log into the Arlo website.

Once you've done that, you need to exercise the push-to-talk feature and capture what HTTP requests are being made. Once you do that, you can paste them here and we can go from there. (If you're not sure how to do this, I have a Slack channel that we can jump on and I can walk you through it via a screen share if you'd like.)

jeffreydwalter avatar Mar 17 '18 18:03 jeffreydwalter

FYI - I have Arlo Pro 2 cameras that support this. Need to check if it supports it via the web tho…

abritinthebay avatar Mar 24 '18 04:03 abritinthebay

I have Arlo Pro 2 cameras and Chrome...

Hit the microphone button in the live streaming view:

Request URL: https://arlo.netgear.com/hmsweb/users/devices/XXX-XXXXXXX_CCCCCCCCCC/pushtotalk
Request Method: GET

XXX-XXXXXXX is the user id, CCCCCCCCCC is the camera id...
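
For anyone scripting this, here is a minimal sketch of how the captured URL appears to be assembled from the user id and camera id. The function name `pushtotalk_url` and the `BASE_URL` constant are hypothetical and not part of the library; the host and path are taken only from the request quoted above, and authentication headers are left out entirely.

```python
from urllib.parse import quote

# Host and path prefix as seen in the captured request (an observation, not a documented API).
BASE_URL = "https://arlo.netgear.com/hmsweb"


def pushtotalk_url(user_id: str, camera_id: str) -> str:
    """Build the push-to-talk URL in the shape observed in Chrome devtools."""
    # The captured URL joins the user id and camera id with an underscore.
    unique_id = f"{user_id}_{camera_id}"
    return f"{BASE_URL}/users/devices/{quote(unique_id, safe='')}/pushtotalk"


if __name__ == "__main__":
    print(pushtotalk_url("XXX-XXXXXXX", "CCCCCCCCCC"))
```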

Then a new picture of a microphone pops up just below the live stream and you need to press / hold it down to talk:

I think at this point google analytics GET event is performed (not familiar with this so not sure what you need me to paste in here if anything...)

When you release the button, another google analytics GET is performed...

HTH...

jvigilan avatar Mar 24 '18 15:03 jvigilan

Google Analytics clicks are only for tracking / analysis of what you're doing.

Chrome somehow needs to transfer the audio to arlo.netgear.com or some other netgear server. Can you see anything like that?

shoeper avatar Mar 27 '18 19:03 shoeper

@jvigilan what happens when you press and hold the talk button? As @shoeper said, you can ignore the Google Analytics stuff. Do you see any new HTTP requests? What about events in the EventStream?

When you log in, you should see a call to /subscribe, like this. If you click on that request, you will see an "EventStream" tab in the right-hand pane (see the screenshot). If you click on that tab, you will see all of the events your browser has received from the Arlo servers. When you click the microphone button, what events do you see?
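
If you would rather capture these events from a script than from the devtools pane, the following is a rough sketch of filtering a Server-Sent Events stream for push-to-talk messages. It assumes that /subscribe delivers standard `data:` lines and that each payload is a JSON object with an `action` field, as in the messages quoted elsewhere in this thread. The function names are hypothetical, and opening the authenticated stream is not shown.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from the 'data:' lines of an SSE stream."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # Per the SSE format, a single leading space after the colon is dropped.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line terminates the event.
            payload = "\n".join(buffer)
            buffer = []
            try:
                yield json.loads(payload)
            except json.JSONDecodeError:
                continue  # skip payloads that are not JSON
        # Comment lines (starting with ':') and other fields are ignored.


def push_to_talk_events(lines: Iterable[str]) -> list:
    """Keep only events whose 'action' is 'pushToTalk' (assumed field name)."""
    return [
        event for event in iter_sse_events(lines)
        if isinstance(event, dict) and event.get("action") == "pushToTalk"
    ]
```

Passing any iterable of text lines works, for example the line iterator of a streaming HTTP response.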


If you're interested, I've got a Slack channel. We can jump on there and reverse engineer it over a screen share real quick (when we both have half an hour or so).

jeffreydwalter avatar Mar 27 '18 21:03 jeffreydwalter

@jeffreydwalter I watched the EventStream as suggested and there are 3 'pushToTalk' actions every time I hit the microphone. No new messages arrive while I hold down the microphone. A new message arrives when I turn off the microphone (Close X). I attached a text file with the messages from the Chrome DevTools. If you need more, I can certainly jump on a Slack channel session. Arlo_Push_to_Talk.txt

jvigilan avatar Mar 27 '18 23:03 jvigilan

@jvigilan I should have some time this week to jump on Slack if you still want to. Let me know.

jeffreydwalter avatar May 06 '18 23:05 jeffreydwalter

@jeffreydwalter yes, I can meet this week... Tuesday is not good nor is Thursday afternoon / evening...when is good day / time for you?

jvigilan avatar May 07 '18 11:05 jvigilan

How about Wednesday afternoon, say 1 or 2 pm CST?

jeffreydwalter avatar May 07 '18 16:05 jeffreydwalter

Hi Jeffrey – 2:00 CST on Wednesday works for me…. Assume you will send the slack channel invite…

Thanks, John Vigilante 219.921.6661


jvigilan avatar May 07 '18 18:05 jvigilan

Awesome, that works fine for me. Here's the link. https://join.slack.com/t/arlo-dev/shared_invite/enQtMzYwNTczMzQ4NTgyLTdlZjgzZjc5NTdhOWZkYzg3MWQ5YzhkNTI4ODgzMmYyMmI3NjBjNjExY2U3MzM4YzljMGMzZDAxZjI0OWQ3Mjg

jeffreydwalter avatar May 07 '18 18:05 jeffreydwalter

Hi Guys,

Did you find any workaround for this? @jeffreydwalter @jvigilan

sherifmka2004 avatar Jun 04 '18 16:06 sherifmka2004

@sherifmka2004 we looked into it a little bit, but I haven't had time to follow up.

jeffreydwalter avatar Jun 04 '18 17:06 jeffreydwalter

@jeffreydwalter I noticed these three messages only appear when you click the speak button, but then there is a UDP stream afterwards.

If there’s something I can help with, let me know.

sherifmka2004 avatar Jun 04 '18 18:06 sherifmka2004

@sherifmka2004 I still haven't had time to dig into this, but it appears that they are using RTP (Real-time Transport Protocol) for the audio, with SDP (Session Description Protocol), STUN (Session Traversal Utilities for NAT), and ICE (Interactive Connectivity Establishment) used to establish the connection.

Here are some resources:
1. https://www.cs.columbia.edu/~hgs/rtp/faq.html
2. https://tools.ietf.org/id/draft-ietf-mmusic-ice-sip-sdp-14.html
3. https://tools.ietf.org/html/rfc4566
4. https://www.avaya.com/blogs/archives/2014/08/understanding-webrtc-media-connections-ice-stun-and-turn.html

The conversation goes about like this:

1. POST /users/devices/{unique_id}/pushtotalk

{"data":{"uSessionId":"XXXXXXXXXXXXX!2856F0D8!1525893890884","data":[{"url":"stun:relay01-z2-prod.vz.netgear.com:19302"},{"credential":"XXXXXXXXXXXXXXXXXXXXXXX/XXXXX=","url":"turn:relay01-z2-prod.vz.netgear.com:443?transport=tcp","username":"1525893901:XXX-XXXXXXX"},{"credential":"XXXXXXXXXXXXXXXXXXXXXXX/XXXXX=","url":"turn:relay01-z2-prod.vz.netgear.com:443?transport=udp","username":"1525893901:XXX-XXXXXXX"}],"type":"iceServers"},"success":true}

2. POST /notify
{"action":"pushToTalk","from":"XXX-XXXXXXX","publishResponse":true,"resource":"cameras/XXXXXXXXXXXXX","responseUrl":"","to":"XXXXXXXXXXXXX","transId":"web!98b0c88b!1429756137177","properties":{"uSessionId":"XXXXXXXXXXXXX!2856F0D8!1525893890884","type":"offerSdp","data":"v=0\r\no=- 2808742620419521074 2 IN IP4 127.0.0.1\r\ns=-\r\nt=0 0\r\na=group:BUNDLE audio\r\na=msid-semantic: WMS PFHb9mMwEp0ThE5Ruhsk1rFRtZTyAAGcPJsQ\r\nm=audio 9 UDP/TLS/RTP/SAVPF 111 103 104 9 0 8 106 105 13 110 112 113 126\r\nc=IN IP4 0.0.0.0\r\na=rtcp:9 IN IP4 0.0.0.0\r\na=ice-ufrag:QbPr\r\na=ice-pwd:4GjCKEJNq0N/fruvfRxBZ34V\r\na=ice-options:trickle\r\na=fingerprint:sha-256 EA:F1:38:6C:62:FD:AA:DD:E6:CA:1E:9D:0C:13:2F:5E:9C:3E:F0:2D:C9:93:AE:2E:D5:96:39:D5:93:1D:75:52\r\na=setup:actpass\r\na=mid:audio\r\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\r\na=sendrecv\r\na=rtcp-mux\r\na=rtpmap:111 opus/48000/2\r\na=rtcp-fb:111 transport-cc\r\na=fmtp:111 minptime=10;useinbandfec=1\r\na=rtpmap:103 ISAC/16000\r\na=rtpmap:104 ISAC/32000\r\na=rtpmap:9 G722/8000\r\na=rtpmap:0 PCMU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:106 CN/32000\r\na=rtpmap:105 CN/16000\r\na=rtpmap:13 CN/8000\r\na=rtpmap:110 telephone-event/48000\r\na=rtpmap:112 telephone-event/32000\r\na=rtpmap:113 telephone-event/16000\r\na=rtpmap:126 telephone-event/8000\r\na=ssrc:1253177959 cname:o5ar6SxfzgjLyG1v\r\na=ssrc:1253177959 msid:PFHb9mMwEp0ThE5Ruhsk1rFRtZTyAAGcPJsQ b9d7d868-419f-4aef-9eee-dfdafb792147\r\na=ssrc:1253177959 mslabel:PFHb9mMwEp0ThE5Ruhsk1rFRtZTyAAGcPJsQ\r\na=ssrc:1253177959 label:b9d7d868-419f-4aef-9eee-dfdafb792147\r\n"}}

RESPONSE:
{"from":"XXXXXXXXXXXXX","action":"pushToTalk","resource":"cameras/XXXXXXXXXXXXX","properties":{"uSessionId":"XXXXXXXXXXXXX!B0ED31D6!1525894559869","type":"answerSdp","data":"v=0\r\no=NTGRMEDIA 19304 0 IN IP4 0.0.0.0\r\ns=-\r\nt=0 0\r\na=ice-ufrag:Y22lWpF5OlJfpj74\r\na=ice-pwd:13gZRE0mHTxAZYztTWbzdAI366vDtJpi\r\na=fingerprint:sha-256 C9:4D:E1:45:97:EE:C7:03:43:30:EA:C6:8B:3F:95:E1:FE:72:3B:14:99:60:30:3F:40:D3:04:26:6B:CD:1A:52\r\nm=audio 9 RTP/SAVPF 0 8 97 9 126\r\nc=IN IP4 0.0.0.0\r\na=charset:UTF-8\r\na=rtpmap:9 G722/8000\r\na=rtpmap:0 PCMU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:97 opus/48000/2\r\na=rtpmap:126 telephone-event/8000\r\na=recvonly\r\na=setup:active\r\na=rtcp-mux\r\na=candidate:1 1 udp 2130706687 192.168.0.12 37784 typ host\r\na=candidate:2 1 udp 2130706687 172.14.1.1 44437 typ host\r\n"},"transId":"XXXXXXXXXXXXX!a2718d25!1525894560323"} 

3. POST /notify
{"action":"pushToTalk","from":"XXX-XXXXXXX","publishResponse":false,"resource":"cameras/XXXXXXXXXXXXX","responseUrl":"","to":"XXXXXXXXXXXXX","transId":"web!98b0c88b!1429756137177","properties":{"uSessionId":"XXXXXXXXXXXXX!2856F0D8!1525893890884","type":"offerCandidate","data":"candidate:4172108666 1 udp 2122255103 2001::9d38:6ab8:18a4:197d:3f57:ffec 56848 typ host generation 0 ufrag QbPr network-id 4 network-cost 50"}}

RESPONSE:
{"from":"XXXXXXXXXXXXX","action":"pushToTalk","resource":"cameras/XXXXXXXXXXXXX","properties":{"uSessionId":"XXXXXXXXXXXXX!B0ED31D6!1525894559869","type":"answerCandidate","data":"a=candidate:4 1 udp 16777471 172.29.5.160 56243 typ relay raddr 98.206.61.240 rport 45318\r\n"},"transId":"XXXXXXXXXXXXX!faf118d6!1525894560659"} 


4. POST /notify
{"action":"pushToTalk","from":"XXX-XXXXXXX","publishResponse":false,"resource":"cameras/XXXXXXXXXXXXX","responseUrl":"","to":"XXXXXXXXXXXXX","transId":"web!98b0c88b!1429756137177","properties":{"uSessionId":"XXXXXXXXXXXXX!2856F0D8!1525893890884","type":"offerCandidate","data":"candidate:1645401754 1 udp 2122197247 2601:248:c100:e8d0:94e5:ba0a:548f:4679 56849 typ host generation 0 ufrag QbPr network-id 2"}}


5. POST /notify
{"action":"pushToTalk","from":"XXX-XXXXXXX","publishResponse":false,"resource":"cameras/XXXXXXXXXXXXX","responseUrl":"","to":"XXXXXXXXXXXXX","transId":"web!98b0c88b!1429756137177","properties":{"uSessionId":"XXXXXXXXXXXXX!2856F0D8!1525893890884","type":"offerCandidate","data":"candidate:3413779551 1 udp 2122131711 2601:248:c100:e8d0:5cc:7c02:9ed5:dbec 56850 typ host generation 0 ufrag QbPr network-id 3"}}

RESPONSE:
{"from":"XXXXXXXXXXXXX","action":"pushToTalk","resource":"cameras/XXXXXXXXXXXXX","properties":{"uSessionId":"XXXXXXXXXXXXX!B0ED31D6!1525894559869","type":"answerCandidate","data":"a=candidate:3 1 udp 1694499071 98.206.61.240 37784 typ srflx raddr 192.168.0.12 rport 37784\r\n"},"transId":"XXXXXXXXXXXXX!2e80caaa!1525894560352"} 
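
To make the shape of the exchange above easier to work with, here is a small sketch that pulls the STUN/TURN entries out of the step 1 response and splits an ICE candidate string (as carried in the `offerCandidate` / `answerCandidate` messages) into named fields. The function names are hypothetical and the field layout is inferred only from the captures above plus the standard ICE candidate grammar; it does not talk to Arlo at all.

```python
import json


def ice_servers(response_text: str) -> list:
    """Return the ICE server entries from a /pushtotalk response body."""
    body = json.loads(response_text)
    data = body.get("data") or {}
    # The capture above shows success=true and type="iceServers".
    if not body.get("success") or data.get("type") != "iceServers":
        raise ValueError("unexpected pushtotalk response")
    return data["data"]


def parse_candidate(line: str) -> dict:
    """Split an ICE candidate line (with or without the 'a=' prefix) into fields."""
    text = line.strip()
    if text.startswith("a="):
        text = text[2:]
    if not text.startswith("candidate:"):
        raise ValueError("not an ICE candidate line")
    fields = text[len("candidate:"):].split()
    if len(fields) < 8 or fields[6] != "typ":
        raise ValueError("malformed ICE candidate line")
    parsed = {
        "foundation": fields[0],
        "component": int(fields[1]),
        "transport": fields[2].lower(),
        "priority": int(fields[3]),
        "address": fields[4],
        "port": int(fields[5]),
        "type": fields[7],  # host, srflx or relay in the captures above
    }
    # Remaining tokens come in name/value pairs (raddr, rport, generation, ufrag, ...).
    extras = fields[8:]
    parsed.update(zip(extras[0::2], extras[1::2]))
    return parsed
```

Extension values such as `rport` or `generation` are left as strings, since only the fixed leading fields have a well-defined numeric meaning here.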

jeffreydwalter avatar Sep 13 '18 20:09 jeffreydwalter

@jeffreydwalter I'm interested in this feature too, so I gave it a try last night. Here is what I saw in Wireshark when opening the push-to-talk icon, following the HTTP requests you listed above:

After request 1, the client (laptop) knows where to find a STUN server. STUN is then used to find a network path between the client (laptop) and the server (camera) through NAT, via a series of binding requests and responses in UDP packets. Requests 2, 3, 4 and 5 then use the SDP offer/answer model to agree on the media codec for SRTP transmission and to exchange certificate fingerprints. After all of this, DTLS v1.0 over UDP is used to exchange the keys used for SRTP.
Then the SRTP stream starts flowing in UDP packets from the client to the server. When you close the push-to-talk icon, the stream stops.
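
As an illustration of the codec part of that offer/answer step, the sketch below lists the codecs that appear in both an SDP offer and an SDP answer by reading their `a=rtpmap` lines. It is only a simplified view: real negotiation also depends on the payload types listed on the `m=` line and their order, which this ignores. The function names are hypothetical.

```python
import re

# Matches lines such as "a=rtpmap:111 opus/48000/2" (payload type, name, clock rate).
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)")


def codecs(sdp: str) -> dict:
    """Map (codec name in lower case, clock rate) -> payload type for one SDP blob."""
    found = {}
    for line in sdp.splitlines():
        match = RTPMAP.match(line)
        if match:
            key = (match.group(2).lower(), int(match.group(3)))
            found[key] = int(match.group(1))
    return found


def common_codecs(offer_sdp: str, answer_sdp: str) -> dict:
    """Return codecs present in both SDPs as {codec: (offer_pt, answer_pt)}."""
    offered = codecs(offer_sdp)
    answered = codecs(answer_sdp)
    return {
        key: (offered[key], answered[key])
        for key in answered
        if key in offered
    }
```

Note that the payload type numbers can differ between the two sides (opus is 111 in the offer above and 97 in the answer), which is why the result keeps both.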

The hard part for implementation would be the DTLS part, from my viewpoint.
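
One small piece of the DTLS step that can be sketched without a DTLS library is the fingerprint check: in DTLS-SRTP, each side compares the certificate presented in the handshake against the `a=fingerprint` value from the other side's SDP. The sketch below assumes you already have the peer certificate as DER bytes from whatever DTLS implementation ends up being used, and it only handles `sha-256`, which is what both SDPs above advertise. The function names are hypothetical.

```python
import hashlib
from typing import Optional


def sdp_fingerprint(cert_der: bytes) -> str:
    """Format the SHA-256 digest of a DER certificate as colon-separated upper-case hex."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def extract_fingerprint(sdp: str) -> Optional[str]:
    """Return the sha-256 fingerprint value from an SDP blob, or None if absent."""
    prefix = "a=fingerprint:sha-256 "
    for line in sdp.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].strip().upper()
    return None


def fingerprint_matches(cert_der: bytes, sdp: str) -> bool:
    """Check a peer certificate (DER bytes) against the fingerprint in the peer's SDP."""
    expected = extract_fingerprint(sdp)
    return expected is not None and expected == sdp_fingerprint(cert_der)
```

This does not perform the handshake or derive SRTP keys; it only covers the comparison that ties the DTLS certificate back to the signalling messages.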

kt9302 avatar Jan 01 '19 16:01 kt9302

@jeffreydwalter I'm new to your Arlo API and am trying to get this feature to work with my Arlo Pro 2 cameras. I see that this method has been included in the documentation. Is it functional? If so, how would I go about implementing this so that I can talk through a camera from my code?

gshappell1 avatar Aug 07 '19 17:08 gshappell1

@kt9302 thanks for the info. I'm unfortunately too busy to contribute to the library right now. I'd be happy to advise and more than happy to accept pull requests. It looks like there are several Python DTLS libraries available, so it might be trivial to connect to the DTLS stream.

jeffreydwalter avatar Aug 14 '19 23:08 jeffreydwalter

@gshappell1 push-to-talk is not supported currently.

jeffreydwalter avatar Aug 14 '19 23:08 jeffreydwalter