mediamtx icon indicating copy to clipboard operation
mediamtx copied to clipboard

Issues with Moonfire NVR

Open hn opened this issue 1 year ago • 9 comments

Which version are you using?

v0.19.3

Which operating system are you using?

  • [ ] Linux amd64 standard
  • [ ] Linux amd64 Docker
  • [ ] Linux arm64 standard
  • [x] Linux arm64 Docker
  • [ ] Linux arm7 standard
  • [ ] Linux arm7 Docker
  • [ ] Linux arm6 standard
  • [ ] Linux arm6 Docker
  • [ ] Windows amd64 standard
  • [ ] Windows amd64 Docker (WSL backend)
  • [ ] macOS amd64 standard
  • [ ] macOS amd64 Docker
  • [ ] Other (please describe)

Describe the issue

When using Moonfire NVR as client, the connection breaks every 30 seconds. It seems that the SET_PARAMETER call used for keep alives is the root cause.

Describe how to replicate the issue

No special config needed, simply use Moonfire NVR as client, e.g. as described in related bug report https://github.com/scottlamb/moonfire-nvr/issues/234

Did you attach the server logs?

yes, please see debug log:

08:14:51 DEB [RTSP] [conn 172.18.0.2:32848] [c->s] SET_PARAMETER rtsp://rtsp-ss:8554/garagedoor_main/ RTSP/1.0
CSeq: 4
Session: 3130046022
User-Agent: Moonfire NVR 0.7.4

08:14:51 DEB [RTSP] [conn 172.18.0.2:32848] [s->c] RTSP/1.0 400 Bad Request
CSeq: 4
Server: gortsplib

08:14:51 INF [RTSP] [conn 172.18.0.2:32848] closed (unhandled request: SET_PARAMETER rtsp://rtsp-ss:8554/garagedoor_main/)
08:14:51 INF [RTSP] [conn 172.18.0.2:32854] opened
08:14:51 DEB [RTSP] [conn 172.18.0.2:32854] [c->s] TEARDOWN rtsp://rtsp-ss:8554/garagedoor_main/ RTSP/1.0
CSeq: 1
Session: 3130046022
User-Agent: Moonfire NVR 0.7.4

08:14:51 DEB [RTSP] [conn 172.18.0.2:32854] [s->c] RTSP/1.0 200 OK
CSeq: 1
Server: gortsplib

08:14:51 INF [RTSP] [session 778887791] destroyed (teared down by 172.18.0.2:32854)
08:14:51 INF [RTSP] [conn 172.18.0.2:32854] closed (EOF)

Unlike in https://github.com/aler9/rtsp-simple-server/issues/842, a session header is included in the keep alive. Therefore I do not understand that the response is "400 Bad Request" and the connection is closed.

Did you attach a network dump?

no

hn avatar Aug 03 '22 13:08 hn

With @r0l1's patch https://github.com/aler9/gortsplib/pull/102/commits/212b081dcfdcaee1f32093f5311abec65e79e871 (only the serverconn.go part), the SET_PARAMETER call is answered with 200 OK and the connection stays stable. However, after 60s ErrServerNoRTSPRequestsInAWhile kicks in ("no RTSP requests received in a while") and closes the (UDP) connection. If I switch the connection type to TCP everything works as expected and there are no disconnects at all. This patch+workaround is fine for me for now.

hn avatar Aug 05 '22 06:08 hn

Hello, This is caused by multiple issues inside Moonfire's RTSP library (https://github.com/scottlamb/retina), in particular:

  • SET_PARAMETER keepalives may be used only if the server supports them and if the server declares support inside OPTION responses, i.e:

    OPTIONS rtsp://server-url RTSP/1.0
    
    RTSP/1.0 200 OK
    Public: OPTIONS, ANNOUNCE, PLAY, GET_PARAMETER, ...
    

    If SET_PARAMETER is not present inside the Public header, a client must not use it, i.e., this assumption is wrong:

    https://github.com/scottlamb/retina/blob/2b700fd32b726033bc764dd663ba453cb385d15c/src/client/mod.rs#L1907

    Of course the server can be patched to support SET_PARAMETER keepalives, but a client must always check for available methods before using them, therefore the library may work with this server but won't work with many others.

  • A client that uses the UDP transport protocol must send periodic keepalives; the interval of these keepalives must be taken from the session header, for example:

    RTSP/1.0 200 OK
    Session: abcd;timeout=60
    

    The interval of keepalives must be < 60. Therefore, it's wrong to hardcode the keepalive interval:

    https://github.com/scottlamb/retina/blob/2b700fd32b726033bc764dd663ba453cb385d15c/src/client/mod.rs#L42

Please report these issues to the retina developers.

aler9 avatar Aug 05 '22 08:08 aler9

Thank you @aler9. Your detailed explanation will certainly help a lot to improve things.

hn avatar Aug 05 '22 09:08 hn

Hi. Moonfire/Retina developer here.

SET_PARAMETER keepalives may be used only if the server supports them and if the server declares support inside OPTION responses

Not sure i agree. I don't actually care if the request succeeds (setting no parameters, as there's no body anyway) or fails with a 501 Not Implemented, so long as it extends the session time and doesn't drop the connection (more below). So OPTIONS doesn't really tell me anything I care about; even an unsupported method would be fine with most servers.

If SET_PARAMETER is unsupported, is there any other RTSP method you accept as a keepalive? chose SET_PARAMETER because the ONVIF spec recommended it.

Do you accept RTCP receiver reports as keepalives?

I believe the problem here is that rtsp-simple-server is dropping the connection after sending the 501. When using interleaved channels, this effectively kills the session. Why does it do that? (Even when using UDP, Moonfire/Retina doesn't gracefully handle loss of the RTSP connection right now. I know it should, but I haven't gotten around to it yet.)

Of course the server can be patched to support SET_PARAMETER keepalives, but a client must always check for available methods before using them, therefore the library may work with this server but won't work with many others.

What others? Retina's been tested with maybe a couple dozen implementations by now and hasn't had any problems with its keepalives.

it's wrong to hardcode the keepalive interval

Yeah, I can switch that. Is it relevant? What timeout are you using?

scottlamb avatar Aug 05 '22 12:08 scottlamb

Hello, this is the procedure to send RTSP keepalives from a RTSP client to a RTSP server:

  1. when receiving the OPTIONS response, you have to annotate whether the server supports GET_PARAMETER
  2. When sending keepalives, you have to use GET_PARAMETER if it is supported, otherwise OPTIONS

This procedure is implemented by FFmpeg, GStreamer, VLC and rtsp-simple-server: https://github.com/FFmpeg/FFmpeg/blob/a78f136f3fa039fd7ad664fd6e6e976f1448c68b/libavformat/rtspdec.c#L947 https://github.com/GStreamer/gst-plugins-good/blob/20bbeb5e37666c53c254c7b08470ad8a00d97630/gst/rtsp/gstrtspsrc.c#L5366 https://github.com/aler9/gortsplib/blob/7d4da47da4159433101bc49d16ddb36d398916c1/client.go#L551

Regarding timeout, i can assure you that there are cameras with timeouts < 30secs, and that taking the parameter from the Session header is the only reliable way to get a stable connection.

aler9 avatar Aug 05 '22 12:08 aler9

this is the procedure to send RTSP keepalives from a RTSP client to a RTSP server

That's the procedure those implementations use, sure. It's not what the ONVIF streaming spec recommends, and it's not mandated in the RTSP RFC, so I don't think it's the only valid approach.

If I sent GET_PARAMETER, would everything have worked smoothly? I'm confused because the linked https://github.com/aler9/gortsplib/commit/212b081dcfdcaee1f32093f5311abec65e79e871 is described as "added support for SET_PARAMETER & GET_PARAMETER keep-alives". I haven't read through it yet, but that certainly suggests GET_PARAMETER is not supported now (despite being in the Options response you quoted).

Regarding timeout, i can assure you that there are cameras with timeouts < 30secs, and that taking the parameter from the Session header is the only reliable way to get a stable connection.

Fair enough, and I'll fix my implementation there when I get a chance. But is it relevant to why this particular session failed? What timeout is rtsp-simple-server using?

scottlamb avatar Aug 05 '22 13:08 scottlamb

Current RTSP 1.0 specification (https://www.rfc-editor.org/rfc/rfc2326.html) clearly states that

Some servers may not support setting stream parameters and thus
not support GET_PARAMETER and SET_PARAMETER.

If you're referring to the RTSP 2.0 specification (https://www.rfc-editor.org/rfc/rfc7826.html) it's not fully implemented by any major client and i think it will never be, since vendors have agreed to develop and support a new protocol (SRT).

Anyway, it's the server that decides the keepalive method, not the client.

If I sent GET_PARAMETER, would everything have worked smoothly?

yes, as long as you're also including the Session header

the linked https://github.com/aler9/gortsplib/commit/212b081dcfdcaee1f32093f5311abec65e79e871 is described as "added support for SET_PARAMETER & GET_PARAMETER keep-alives"

This is a workaround not included inside the main branch

But is it relevant to why this particular session failed? What timeout is rtsp-simple-server using?

Probably not since this server uses 60 seconds.

BTW, let me give you an advice regarding RTSP: the problem of the protocol is that it's implemented (badly) into a lot of embedded hardware, that can't be changed. There are a lot of constraints that aren't specified inside the RFC but have become common practice since most vendors have chosen to adapt them. Therefore, you can chose to either support a RFC compliant and strict version of the protocol, that may work with software that is actively maintained and can be changed (like this server), or to relax constraints in order to support most cameras.

This is unfortunately also true with protocols that follows RFCs more strictly, like WebRTC: open source developers (like the one of the pion project) tried to follow the specifications, but since Chrome/Chromium has an enormous market share, pion has been repeatedly adapted in order to better support Chromium.

aler9 avatar Aug 05 '22 16:08 aler9

Some servers may not support setting stream parameters and thus not support GET_PARAMETER and SET_PARAMETER.

It also says this:

       If the recipient of the message does
       not understand the request, it responds with error code 501 (Not
       implemented) and the sender should not attempt to use this method
       again. A client may also use the OPTIONS method to inquire about
       methods supported by the server. The server SHOULD list the
       methods it supports using the Public response header

My expectation had been there shouldn't be any harm from sending an unsupported method. This has been true of all the servers I've tried until now. Note the lower case of the "should" not attempt to use those method again; I paid little attention to that optional advice since I'm not actually trying to set a parameter, just keep the connection warm...

simple-rtsp-server is not just failing the request (400, 501, whatever) but also dropping the connection. With interleaved channels, that's a destructive action. I'd strongly advise changing this behavior.

BTW, let me give you an advice regarding RTSP: the problem of the protocol is that it's implemented (badly) into a lot of embedded hardware, that can't be changed

Yeah I've also learned the same thing about RTSP implementations in general and have had to add various workarounds. I'm open to making changes to my implementation. I'm just surprised by your "this is the one way to do it" when I've seen contrary advice from a standard body.

If you're referring to the RTSP 2.0 specification (https://www.rfc-editor.org/rfc/rfc7826.html) it's not fully implemented by any major client and i think it will never be, since vendors have agreed to develop and support a new protocol (SRT).

I'm not. My code is intended to be purely RTSP/1.0, not 2.0. apologies for not being more precise before. I'm on my way to the airport now and a bit rushed... I'm typing this comment on my phone.

(Although oddly enough I've noticed the onvif streaming spec refers to the 2.0 rfc in a couple places to clarify expectations, despite using 1.0 otherwise). It's frustrating that the rfc writers have essentially abandoned the RTSP/1.0 errata in favor of this incompatible protocol no one will ever use and folks like us are stuck reading tea leaves...)

scottlamb avatar Aug 05 '22 17:08 scottlamb

simple-rtsp-server is not just failing the request (400, 501, whatever) but also dropping the connection. With interleaved channels, that's a destructive action. I'd strongly advise changing this behavior.

The server will be adapted in order to return 501 and not not drop the connection.

aler9 avatar Aug 05 '22 17:08 aler9

The server has been adapted in order to return 501 in case of unhandled methods, starting from v0.20.0.

aler9 avatar Aug 16 '22 17:08 aler9

Is there a way to change rtsp simple server timeout? I noticed that many times, vlc sends the GET_PARAMETER just a second after rtsp simple server already disconnects.

RobotnickIsrael avatar Nov 03 '22 09:11 RobotnickIsrael

This issue is being locked automatically because it has been closed for more than 6 months. Please open a new issue in case you encounter a similar problem.

github-actions[bot] avatar May 08 '23 15:05 github-actions[bot]