MPD
ffmpeg decoder can depend on InputStream's URI
Feature request
Could InputStream::GetURI() be virtual/mutable or otherwise expose a new way to customize the current URI (maybe via ReadTag) for decoders to use? The actual remote resource's URI may not necessarily be known until after the InputStream constructor, particularly when using a late-initialized ProxyInputStream.
Rationale
FfmpegOpenInput() passes the InputStream's URI to ffmpeg/lavf open: FfmpegDecoderPlugin.cxx. Normally this isn't used for IO, and the AVIOContext callbacks (and InputStream::Read/etc) are used instead. However, it appears that in certain scenarios the URI is actually used to access the resource. One example is HLS, where a playlist is repeatedly refreshed by ffmpeg to check for the latest chunks. The refreshes do not currently go through InputStream::Rewind, and instead ffmpeg initiates new connections directly to the URL.
This was discovered when working on the youtube-dl plugin, where the user-facing web URL often isn't the same as the actual direct URI to the content being played. The real URI is known only after the InputStream constructor, but before SetReady().
Why is it important for you to have the effective URL returned by GetURI()? That should be opaque information, shouldn't it?
An example scenario with this plugin is:
- mpc add plugin://https://twitch.tv/monstercat, mpc play - wanting to play an HLS radio stream
- A ProxyInputStream is constructed with the user-facing URI, no web requests have been made yet
- The final URI to play is determined with an asynchronous request, it resolves to https://example.cdn.net/v1/playlist/whatever.m3u8 ($(youtube-dl -f bestaudio -g https://twitch.tv/monstercat) if you want a real example)
- SetInput(OpenCurlInputStream(new_uri)) - eventually SetReady() happens on the underlying curl stream...
- FfmpegOpenInput(input.GetURI()) the decoder thread starts, giving ffmpeg the wrong/original/old URI and not the one we want it to use
- The HLS stream begins to play, getting the initial playlist from InputStream::Read(). All is well until a few seconds later...
- ffmpeg internally tries to refresh the m3u8 playlist and makes a request to twitch.tv instead of example.cdn.net, and fails and stops the stream.
The decoder uses the URI, but it's stale information and it doesn't have a way to request the latest known URI that it needs to be using.
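To make the request concrete, here is a minimal sketch of what a virtual GetURI() could look like. SketchInputStream and SketchProxyInputStream are hypothetical, simplified stand-ins written for this comment; they are not MPD's real InputStream or ProxyInputStream classes, and the real signatures may differ.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for MPD's InputStream (assumption,
// not the real class): it only stores the URI it was constructed with.
class SketchInputStream {
	std::string uri;

public:
	explicit SketchInputStream(std::string _uri) : uri(std::move(_uri)) {}
	virtual ~SketchInputStream() = default;

	// The requested change: virtual, so a subclass can decide what
	// URI the decoder gets to see.
	virtual const std::string &GetURI() const noexcept { return uri; }
};

// Hypothetical late-initialized proxy: constructed with the user-facing
// URI, it receives the real stream (and its URI) later via SetInput().
class SketchProxyInputStream : public SketchInputStream {
	std::unique_ptr<SketchInputStream> input;

public:
	using SketchInputStream::SketchInputStream;

	void SetInput(std::unique_ptr<SketchInputStream> _input) noexcept {
		input = std::move(_input);
	}

	// Forward to the underlying stream once it exists; until then,
	// fall back to the original user-facing URI.
	const std::string &GetURI() const noexcept override {
		return input ? input->GetURI() : SketchInputStream::GetURI();
	}
};
```

With something like this, a decoder that calls GetURI() after SetReady() would get the resolved CDN URI rather than the stale user-facing one.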
I still don't understand. Why would FFmpeg care about the URI? This is opaque, because FFmpeg does not actually open that URI. Instead, it uses the InputStream contents.
So that's the problem, it ideally would be using the InputStream but... ffmpeg is using the URI to make additional playlist reload requests. It needs to continually refresh the HLS playlist, and as far as I can tell it's making new requests each time. It seems to support keepalive and can reuse the connection, but not through the AVIOContext. It only uses the InputStream for the initial read of the playlist data.
- Relevant code is around: https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/hls.c#L729
- aviocontext is discarded on refreshes? https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/hls.c#L1447
- It's probably possible to intercept this using URLProtocol which seems to be the http equivalent to AVIOContext, but I imagine that would require a lot of restructuring.
You can watch this happen by just adding the raw URL using the normal curl input plugin: mpc add $(youtube-dl -f bestaudio -g https://twitch.tv/monstercat). If you check with debug logging/tracing/etc, it reads the InputStream once, then starts making its own additional requests to refresh it. It's also worth noting that since the InputStream data is just a playlist of URLs, none of the actual .ts data requests go through InputStreams at all?
So this is maybe partially a bug report because ffmpeg probably shouldn't be using the URI... but since it currently does... The feature request here is that the InputStream needs some way to customize the URI being passed to ffmpeg so that the decoder doesn't break.
On 2020/10/20 19:00, arcnmx [email protected] wrote:
So that's the problem, it ideally would be using the InputStream but... ffmpeg is using the URI to make additional playlist reload requests. It needs to continually refresh the HLS playlist, and as far as I can tell it's making new requests each time. It seems to support keepalive and can reuse the connection, but not through the AVIOContext.
Uh-oh, this means FFmpeg circumvents MPD's InputStream API and the plugins, and uses its own blocking I/O (which cannot be canceled, for example). That's bad. That's not a code path we should design for.
from a developer on IRC:
04:00 <+JEEB> I think my earlier comment specifically noted that
04:00 <+JEEB> teh whole thing around protocols and people not knowing of rw_timeout etc
04:01 <+JEEB> oooh yea, HLS meta-demux
04:01 <+JEEB> yea those things suck because they break the abstraction layers
04:02 <+JEEB> the I/O IIRC can be canceled but it could seem like it can't be due to not all
modules handling the cancel call-back/flag correctly
04:02 <+JEEB> but that's not the problem
04:03 <+JEEB> the problem is that "lol HLS demux just opens stuff around"
04:04 <+JEEB> I wish more people actually maintained that module and the writer module
04:05 <+JEEB> but if there's clearly not a ticket about that, and you can generate a simple API
client example for it, please file a ticket for it so it's actually mentioned
somewhere
@neheb translate to English, please
It's just a random developer confirming what you said.
If there's anything that this random developer can teach me about I/O cancellation in FFmpeg, I'd like to hear about it. Cancellation is important, and works only with non-blocking APIs, which FFmpeg does not provide at all. But maybe I'm wrong. (Unlike CURL and libnfs, for example - all HTTP and NFS I/O initiated by MPD through CURL can be canceled instantly, which is why pressing "stop" always works with no delay, no hangs, no wait.) The text you pasted talks about cancellation, but my brain fails to extract any usable information from it.
- Yes, unfortunately there are no async avformat, avcodec or avfilter APIs yet.
- AVIOInterruptCB is the way to currently interrupt any I/O done by an AVFormatContext. This may have bugs and unfortunate consequences, but the thing has been there since circa 2011 and is referenced by various core bits (avio.c/network.c etc) as well as these dash/hls meta-demuxers, so I'd expect it to be usable outside of bugs.
- Yea, sorry about the meta-things opening their own AVIO things, this indeed is quite sub-optimal if you're attempting to provide your own AVIOContext solutions.
Thanks @jeeb, indeed I was not aware of AVIOInterruptCB, and it may be helpful to reduce some of the pain of blocking I/O when working with FFmpeg. I will implement that in MPD.
There are various problems with AVIOInterruptCB; for example, this is not real cancellation - I can only tell the library to stop doing something when the library feels like asking me. I have no idea when and how often this happens.
For example, libavformat/network.c lets the CPU wake up every 100ms to ask the callback. It wakes up 10 times a second even if there is no callback. Doing so forces FFmpeg to always wake up the CPU while waiting for I/O these 10 times a second. What's the point of wasting so many wakeups - FFmpeg could do so much better. Like using an eventfd (or a pipe for non-Linux systems) to wake up the poll() at any time, without periodic wakeups.
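A rough sketch of the pipe variant of that idea, using only POSIX pipe() and poll(). WakePipe, WaitResult and WaitReadable are hypothetical names made up for this illustration; this is not code from FFmpeg or MPD. On Linux, an eventfd could take the place of the pipe.

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper (assumption): a self-pipe whose read end is
// polled together with the I/O descriptor.
class WakePipe {
	int fds[2] = {-1, -1};

public:
	WakePipe() noexcept {
		if (pipe(fds) < 0)
			fds[0] = fds[1] = -1;
	}

	~WakePipe() noexcept {
		if (fds[0] >= 0)
			close(fds[0]);
		if (fds[1] >= 0)
			close(fds[1]);
	}

	WakePipe(const WakePipe &) = delete;
	WakePipe &operator=(const WakePipe &) = delete;

	bool IsValid() const noexcept { return fds[0] >= 0; }
	int GetReadFD() const noexcept { return fds[0]; }

	// May be called from another thread to cancel a pending wait.
	void Wake() noexcept {
		const char c = 0;
		const ssize_t nbytes = write(fds[1], &c, 1);
		(void)nbytes;
	}
};

enum class WaitResult { Ready, Cancelled, Failed };

// Block until io_fd is readable or the wake pipe fires. The timeout
// is -1 (infinite), so the CPU is never woken up just to ask a
// callback whether it should stop.
WaitResult WaitReadable(int io_fd, const WakePipe &wake) noexcept {
	struct pollfd pfds[2] = {
		{io_fd, POLLIN, 0},
		{wake.GetReadFD(), POLLIN, 0},
	};

	int n;
	do {
		n = poll(pfds, 2, -1);
	} while (n < 0 && errno == EINTR);

	if (n < 0)
		return WaitResult::Failed;

	// Cancellation takes priority over pending I/O.
	if (pfds[1].revents != 0)
		return WaitResult::Cancelled;

	return WaitResult::Ready;
}
```

The waiting thread sleeps until there is either data or a cancellation request; a "stop" from another thread is a single Wake() call and takes effect immediately instead of at the next 100ms tick.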
(Anecdote: I joined the MPD project 12 years ago because I wanted to eliminate MPD's unnecessary main loop wakeups - it was more than 100 times a second. The strace looked horrible.)
And FFmpeg's approach requires me to run the FFmpeg functions in a separate thread, because it's still a blocking API, and I can't go on with application business in the same thread.