
Seek never completes with target before buffer and inside small gap limit

Open btsimonh opened this issue 4 years ago • 22 comments

Have you read the FAQ and checked for duplicate open issues? yes

What version of Shaka Player are you using? 2.5.6

Can you reproduce the issue with our latest release version? yes

Can you reproduce the issue with the latest code from master? (I am not building the player from source)

Are you using the demo app or your own custom app? Custom

If custom app, can you reproduce the issue using our demo app? This issue would be difficult to reproduce using manual controls.

What browser and OS are you using? Windows 10, Chrome latest

For embedded devices (smart TVs, etc.), what model and firmware version are you using?

What are the manifest and license server URIs?

unencrypted

What did you do?

See This Repo for reproduction. Seek to ~1 minute, then seek backwards 1s at a time (arrow left...) until the video freezes while the reported time continues to change by -1s.

What did you expect to happen? The video should not freeze. The player should load buffers and seek in video.

What actually happened?

When the requested time is before the start of the first loaded buffer by up to config.streaming.smallGapLimit, the buffers are cleared but not reloaded, and the seek does not complete, yet currentTime still changes (i.e. the video freezes). When the seeked time is < (start of the first loaded buffer - smallGapLimit) or > (start of the first loaded buffer), normal seeking resumes. Note: I have not checked the exact numbers :).
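For concreteness, a minimal console sketch of the failing window, assuming video is the media element and player is the attached shaka.Player:

// Any seek target inside (bufferStart - smallGapLimit, bufferStart) shows the
// freeze: currentTime changes, but 'seeked' never fires and the picture stops.
const limit = player.getConfiguration().streaming.smallGapLimit;  // 0.5 by default
const bufferStart = video.buffered.length ? video.buffered.start(0) : 0;
video.currentTime = bufferStart - (limit / 2);  // lands inside the bad window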

btsimonh avatar Nov 19 '19 12:11 btsimonh

It often happens that I can seek in the video, never see a 'seeked' event, and have shaka (and the video element) report a time which does not match the time displayed in the video.

Question: any advice on ideal media (segment size, etc.) and player config for the best seeking response would be happily received. I'm really happy with HTML5 video with straight MP4, but now I need to support encryption, and it's killing my seeking performance, even when served locally. E.g. if anyone knows how to tune the player so that it allows frame seeks backwards with fewer 'stalls then rushes', please pipe up :).

btsimonh avatar Nov 19 '19 17:11 btsimonh

Playing some more, I have just noticed that in very slow reverse seeking there are 'holes' where the video does not change but reports a new time. When seeking frame forward or frame back inside a 'hole', only a 'seeking' event is generated. Once you seek beyond the 'hole', you can seek back into it successfully. When seeking outside of a 'hole', you get seeking, volumechange, seeked, canplay, canplaythrough.
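For anyone following along, a minimal sketch of logging these events from the console (video being the HTML5 video element):

// Inside a 'hole' only 'seeking' is logged; a normal seek also logs
// 'seeked', 'canplay' and 'canplaythrough'.
['seeking', 'seeked', 'canplay', 'canplaythrough'].forEach((name) => {
  video.addEventListener(name, () => console.log(name, video.currentTime.toFixed(3)));
});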

For the 2-second-segment media, the holes are 0.5s in size, and may occur every 3 (!!??) seconds.

(note: this seems to be config.streaming.smallGapLimit - setting this to zero appears to make very slow seek backwards work without holes?)
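A sketch of testing that at runtime via the standard configure() call (no rebuild needed):

// Setting smallGapLimit to 0 appears to avoid the 'holes' when stepping
// backwards very slowly, at the cost of normal small-gap handling.
player.configure({
  streaming: {
    smallGapLimit: 0,
  },
});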

btsimonh avatar Nov 19 '19 17:11 btsimonh

I tried playing your repro page, served from a local server, and the video won't play at all. I'm getting errors that suggest that your webm is malformed in some way? I'm not an expert in the webm format, so I can't say for sure what the exact problem is, but it looks like the parser is expecting the first element to start with an ID of 0x1c53bb6b and it didn't? Something like that. Did you put the wrong media files in the github page or something?

theodab avatar Nov 22 '19 00:11 theodab

hmmm... that's really strange. Just now I downloaded the repo as a zip from github, unzipped it, then opened 'web server for chrome' (which is pointing to a folder some folders below the 'shakatest-master' folder) and navigated to the folder. As soon as it hits the folder it displays the index.html; I chose the '10s' version, and it loads the video fine.... Did you clone the repo? Maybe git is corrupting the webm on extraction? (Could be line feed substitution if git does not recognise webm as binary?) (Ahh... also, I name the actual media files webma and webmv, so git would definitely not know them by extension! Some stupid idea I had to 'hide' those files when the user is browsing for video....)

btsimonh avatar Nov 22 '19 07:11 btsimonh

Turns out the problem was on my end. I was using the SimpleHTTPServer python module for making a local web server, and on further inspection the server was getting tons of broken pipe errors and such. I tried again with a different method, and it played as expected. So I think the web server was probably sending incomplete webm files or something.

Sorry for the confusion. I'll look into it.

theodab avatar Nov 22 '19 23:11 theodab

So, when you say that video "stalls", do you mean that it stops visually updating? Looking at your code, the "bad reverse" does cause the video to not change (mostly), but it will work fine the moment you stop reversing, so I'm not sure if I would call that a "stall" really.

Examining how your "bad reverse" works, it looks like what is happening is that you are repeatedly performing unbuffered seeks, without giving the player any chance to buffer. You seek forward to 60s, and since you are seeking far past the end of the buffer, this is an unbuffered seek: Shaka Player clears the buffer and starts loading this new content. Then, before it finishes loading, you seek backwards. As there is no loaded content at all, this is technically an unbuffered seek, so it clears the buffer again and begins loading content again. And so on and so forth. Occasionally the image will update; this happens when Shaka Player manages to get a playable amount of content in 100ms.
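In other words (a rough sketch, not your exact repro code), the "bad reverse" amounts to:

// Each tick issues another seek before the previous one has had any chance
// to buffer, so every seek is unbuffered: flush, start fetching, flush again...
const badReverse = setInterval(() => {
  video.currentTime = Math.max(video.currentTime - 1, 0);
}, 100);
// clearInterval(badReverse) stops the reverse.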

About the "holes" you mention, I am not as sure, but I think what you are seeing is the segment boundaries? As we are buffering ahead but you are playing backwards, whenever you get to a new segment it has to be buffered. config.streaming.smallGapLimit happens to, among other things, control an optimization that causes us to be a bit more generous when trying to figure out when we need to load a new segment... but since you're playing backwards, that config value is having the opposite effect and making everything less generous, causing it to not buffer until a bit too late when it hits a segment boundary. That's just a guess though.

On a side note, have you tried to use trick play mode instead? That's our first-party support for rewinding (and also fast forwarding).
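A minimal sketch of what that looks like (the rates here are just illustrative):

// Rewind at 2x through the player's trick play support instead of issuing
// raw seeks on the video element, then return to normal playback.
player.trickPlay(-2);
// ... later ...
player.cancelTrickPlay();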

theodab avatar Nov 25 '19 23:11 theodab

Yes, the test is aggressive and abusive to the system :).

ok, revisited today.

For info only, skip to 'I have changed...' below..... The 'stall' I see is where the video stops updating visually, but the seek bar and currentTime are updated. Your explanation makes complete sense; but I'm not sure why the version served from github (https://btsimonh.github.io/shakatest/) achieves seeks more often, while served locally it does so much less often; maybe the caching headers from the server are different, and so it's actually MORE efficient when the same segment is requested multiple times?

I think maybe it's the 'holes' which threw me - but previously I definitely had times where the video was stopped and the time and the time displayed mismatched; maybe I just happened to stop seeking at a moment when the video time was in one of these 'holes', and so the video did not update. This then led me to try the 'reverse' thing to see if I could reproduce it with something simple, and I think I interpreted the lack of display as the same issue?

That the behaviour is avoided by not seeking if performingUpdate is true for either stream is obvious now - this basically gives the player time to finish loading the segments before another seek is done, and then the next seek is likely to be a buffered seek (roughly the guard sketched below).
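A rough sketch of that guard, relying on the same private internals referenced later in this thread (so it can break between releases):

// Only issue a new seek once neither stream is mid-update; the next seek is
// then much more likely to be a buffered one.
function seekWhenIdle(player, video, time) {
  const states = player.streamingEngine_.mediaStates_;
  const busy = ['audio', 'video'].some((type) => {
    const state = states.get(type);
    return state && state.performingUpdate;
  });
  if (!busy) {
    video.currentTime = time;
  }
}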

I have changed the report to 'unbuffered seek into a gap created by smallGapLimit never completes'. I also modified the first comment to reflect the new thoughts. This is very easily reproduced, e.g. by setting an unreasonably large smallGapLimit:

var config = {
    abr: {
      enabled:false,
    },
    streaming:{
      smallGapLimit:9,
    }
  };

Then on the test page, just seek on the timeline to ~1m, then manually seek bwd (arrow left will go back 1 sec) - once you get beyond the start of the buffers, the seek will not complete until you are 9s before the start of buffers...

I have modified the test repo accordingly, commented out the interval based seek, and modified the readme. The issue can be easily reproduced served from remote or local - updated on https://btsimonh.github.io/shakatest/ now.

I didn't consider trick play yet; my thought being that since I'm serving locally, actually my data is available very quickly.

Q: I note the suggested caching mechanism using service_worker. Is there any thought to cache segment data locally - it seems that in an unbuffered seek the segment data is always discarded completely and re-requested, whereas in my case of stepping backwards, for example, off the start of the available segments, the player would benefit from knowing it has all except one of the required segments to hand?

btsimonh avatar Nov 26 '19 09:11 btsimonh

Q: I note the suggested caching mechanism using service_worker. Is there any thought to cache segment data locally - it seems that in an unbuffered seek the segment data is always discarded completely and re-requested, whereas in my case of stepping backwards, for example, off the start of the available segments, the player would benefit from knowing it has all except one of the required segments to hand?

We don't use the service worker to cache segment data. By design, we only cache the application itself in the service worker. For offline content, we use IndexedDB to store segments for offline consumption.
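A minimal sketch of that offline flow (the manifest URI here is just a placeholder):

// Store a presentation's segments into IndexedDB for later offline playback.
// (2.5-era API; store() returns a Promise of the stored content metadata.)
const storage = new shaka.offline.Storage(player);
storage.store('https://example.com/manifest.mpd').then((storedContent) => {
  console.log('Stored for offline playback at', storedContent.offlineUri);
});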

I can confirm your report on using an unreasonably large smallGapLimit and seeking backward into the gap. But the key here is unreasonably large. The purpose of smallGapLimit is to inform the player that it should not jump over gaps smaller than this, because you would expect the browser to recover from such a small gap automatically. In this case, with a large enough smallGapLimit, of course the browser will not play through it.

So for a large value of smallGapLimit, this is working as expected. There are many ways you could configure Shaka Player to break things, and it's not a bug in Shaka Player if you provide a nonsensical configuration, such as bufferingGoal of 0, or rebufferingGoal of Infinity.

Does this make sense?

joeyparrish avatar Jan 23 '20 20:01 joeyparrish

hi @joeyparrish ,

Thanks for reviewing this issue. Sorry if this is a wall of text :).

Agreed, a 9s smallgaplimit is unreasonably large; but that was only to reproduce the behaviour easily. I believe the behaviour is still present for small values.

The issue is that if a seek aims for a time which is less than smallgaplimit before the first buffered segment, then the seek does not complete, however small the smallgaplimit is: by design the player does not load the required segment, and all other segment data has been discarded (is this true?), so the browser cannot recover? I was observing this behaviour with smallgaplimit at the default value; it was just more difficult to reproduce. (You should be able to reproduce it with a local copy of the test: change smallgaplimit to an 'expected' value, seek to 1m, step back to 00:00:40:10 using the left arrow, then click the button marked 'back' to frame-step backwards. When you get to 00:00:40:08, the video will stop updating. When doing this on the current test page - with the large smallgaplimit - play seemed to still work up to a few frames before 00:00:40:08, but then stops working as you seek back further (I may be mistaken; not necessarily reproducible), even though the browser progress bar indicates data is still present. As you step back further, the browser progress bar indicates the data has been discarded - I guess this is the region that you highlight the browser would 'play through', and which smallgaplimit must be smaller than? Unfortunately I can't run this test myself right now; new laptop, not set up for this yet :( )

So I would expect this to affect anyone who steps back in a video and is unlucky enough to click on the right portion of the timeline.... That sounds unlikely, but with 30s or so buffered it is probably more common than you would think, as this is in the likely range people would step back to if they missed a bit :). What they would see is that the video did not change (e.g. they expected to move back 10s, and it apparently does nothing), and, as described above, Play may or may not work.

As a 'normal' user, I would expect any seek to complete after some short delay, and I would prefer that it hit the position seeked to (although a 'normal' user would probably not complain if it instead seeked to the start of the current segment, which could be a short-term 'fix' - but that would compromise the player for my use case). Note also that after the failed seek, Play is also inoperable - but maybe only for an unreasonable smallgaplimit?

I look forward to your feedback on reproducing the behaviour with small values; the subtle interaction between the various times and the browser load times is incredibly complex. I definitely had some kind of issue with smallgaplimit at its default, but whether it is exactly the same issue that a large value causes, I could not be sure - all I can say is that by increasing the value, an issue was easily reproducible which felt the same. Note: I also remember seeing significant differences between local and remote serving when using small values, not sure why. I'm pretty sure I found it more difficult to reproduce with remote serving; I suspected something related to latency.

thanks again,

Simon

fyi

My user requirements are a little unusual, as the user is continually stepping forward and back in the video (reviewing for compliance, preparing subtitles, noting metadata, etc.), so it would hit my users much more often. It's some time since I reviewed the player (which I will start to use for some users once I have DRMed video in place), but I'm sure I found some sort of workaround at the time (setting smallgaplimit to zero maybe?).

On the caching issue, if I get time I will investigate some changes to only forget segments once they have been played completely - or even keep them in the JS domain for a period of time/size. At the moment they are forgotten as soon as they have been sent to the browser, which means any seek backwards to before the current segment causes re-requests (which come from the browser cache) of data which could already have been present. I'm sure I won't get a huge performance boost - it still has to request at least one segment and then post the new sequence into the browser player in the right order, which one assumes the browser has to parse - but just observing the activity on a backwards seek reveals a tangled web of complexity which could be avoided. Then again, the code complexity of that part may defeat me :). I will investigate the offline caching to see if that would benefit my users as well. I completely understand that my caching thoughts go well beyond the 'normal requirements' for a player designed to reliably watch material, and would only benefit my type of minority use case. However, this is all off-topic for this issue, so when I get to it, I'll start a new issue requesting advice...

btsimonh avatar Jan 24 '20 09:01 btsimonh

The issue is that if a seek aims for a time which is less than smallgaplimit before the first buffered segment, then the seek does not complete, however small the smallgaplimit is: by design the player does not load the required segment, and all other segment data has been discarded (is this true?), so the browser cannot recover?

I was sure this was not accurate, but it seems you're right. With the default config, I can reproduce this easily with our demo content. For example, playing "Angel One", which is short, you can easily see the buffered ranges in the UI. Around time 30, it starts clearing segments from the buffer. At that time, you can run this in the console to reproduce the issue: video.currentTime = video.buffered.start(0) - 0.25 (since the small gap limit is 0.5).
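To see when that eviction has happened, a quick console sketch like this works (video being the demo's media element):

// Print the buffered ranges once a second; once buffered.start(0) has moved
// past 0, the one-liner above reproduces the stuck seek.
setInterval(() => {
  const ranges = [];
  for (let i = 0; i < video.buffered.length; i++) {
    ranges.push(`[${video.buffered.start(i).toFixed(2)}, ${video.buffered.end(i).toFixed(2)}]`);
  }
  console.log('buffered:', ranges.join(' '), 'currentTime:', video.currentTime.toFixed(2));
}, 1000);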

It should also be fairly easy to reproduce this in a unit test.

I'm not sure why this would happen, though. The seek to a time before the buffer start should count as an unbuffered seek, which would cause the buffers to be cleared and new segments to be fetched. But this doesn't appear to happen.

In the failed state, StreamingEngine has a lastSegmentReference in both video and audio in mediaStates_.

So, I could expect this to affect anyone who steps back in a video, and is unlucky enough to click on the right portion of the timeline....

You're absolutely right. It looks as though this could happen to anyone on any content.

joeyparrish avatar Feb 13 '20 01:02 joeyparrish

Thanks @joeyparrish for looking again. Seems my abuse of the player has led to some good :).

btsimonh avatar Feb 13 '20 07:02 btsimonh

Yes, definitely. We appreciate your persistence and patience with us.

joeyparrish avatar Feb 13 '20 16:02 joeyparrish

Note: In the latest code from the master branch, it can be difficult to see when the early parts of the buffer have been cleared. This is caused by a change to our default UI configuration. To revert to the old UI behavior to make this bug easier to reproduce, use:

video.ui.configure('showUnbufferedStart', true);

joeyparrish avatar Apr 07 '20 22:04 joeyparrish

After my refactors for #1339, this issue is still reproducible at the right time with:

video.currentTime = video.buffered.start(0) - 0.25;

joeyparrish avatar Apr 07 '20 22:04 joeyparrish

A few notes for posterity as I debug this:

StallDetector waits for GapJumpingController to tell it to look for a stall. GapJumpingController in this case is waiting on StreamingEngine to append a segment before it does anything. And StreamingEngine is not fetching anything and has no scheduled updates. If I force an update with this:

for (const type of ['audio', 'video']) {
  player.streamingEngine_.scheduleUpdate_(
      player.streamingEngine_.mediaStates_.get(type), 0);
}

Then getTimeNeeded_() returns 60, which is equal to video.duration, so we don't fetch anything. video.buffered.start(0) is 20, and video.currentTime is 19.75.

joeyparrish avatar Apr 07 '20 22:04 joeyparrish

getTimeNeeded_() is returning the end time of the last segment we appended. The code which is supposed to handle this is the seeked() callback, which clears the buffer on an unbuffered seek.

The problem is that seeked() also considers the gap limits. It sees that we are less than smallGapLimit outside of buffered content, so it doesn't clear the buffer.
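Roughly, the decision in question looks like this (pseudocode, not the actual source; the helper names are made up):

// Pseudocode sketch of the seeked() path described above.
function onSeeked(presentationTime) {
  // The gap limits make this check generous: a target slightly before the
  // buffered range still counts as "close enough", so nothing is cleared.
  if (isBufferedWithinGapTolerance(presentationTime, smallGapLimit)) {
    // We expect gap jumping to handle it, but gap jumping is itself waiting
    // for StreamingEngine to append a segment, so nobody acts.
    return;
  }
  // Truly unbuffered seek: flush and fetch around the new position.
  clearAllBuffers();
  scheduleFetchAround(presentationTime);
}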

Experimentally, if I seek 250ms before the buffer start, it won't play through the gap. At 81ms, it won't play through. At 80ms, it will finally play through.

These tests are on ChromeOS 80, with the "Angel One" clip in our demo.

joeyparrish avatar Apr 07 '20 23:04 joeyparrish

If I disable gap-jumping and artificially create a gap at the beginning of playback by changing timestampOffset in StreamingEngine, ChromeOS 80 will play through a gap of up to 999ms in "Angel One".

If I do the same in the middle of the presentation instead of at the beginning, ChromeOS 80 will only play through a gap of up to 20ms. Interestingly, Chrome shows a single buffered range when it is capable of playing through one of these gaps between segments.

It seems that the root issue here, though, is that gap-jumping should take care of this, but it doesn't, because it's waiting for StreamingEngine to append something. Meanwhile StreamingEngine does not clear the buffer or append anything new, because it knows that gap-jumping should kick in based on the gap-jumping config. So we need a way to break this circular logic.

joeyparrish avatar Apr 07 '20 23:04 joeyparrish

It's becoming more clear that the way gap-jumping and stall-detection work may not be fundamentally correct with respect to the browser behavior it's trying to compensate for. It seems to work well enough in most cases, but this edge case is making me rethink things.

For now, I'm taking this issue out of the v2.6 milestone so that we can take time to look more closely at the design of these components.

joeyparrish avatar Apr 09 '20 17:04 joeyparrish

:) So glad I did not try to fix it - I would have made it worse! Thank you so much for believing in my report!

btsimonh avatar Apr 09 '20 17:04 btsimonh

I am also facing a similar issue in Shaka Player.

The version of Shaka Player is 2.5.10. The issue is reproducible with the latest version. The browser is Chrome and the OS is macOS 10.15.6.

What I did: I created custom buttons to skip 10 seconds forward and backward, played content in Shaka Player, pressed the skip-forward (10s) button 4-5 times, then pressed the skip-backward (10s) button 6-7 times. NOTE: I tried the same thing using the seek bar, and the issue is sometimes reproducible there too. In Safari it works fine.

What actually happened: sometimes the content does not load at all, so playback cannot resume.

What I expected: the content should play from the resume point.
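For reference, the skip buttons amount to something like this (the element IDs are made up):

// Each click is a raw seek on the media element, which is what ends up on
// the unbuffered-seek path described earlier in this issue.
document.getElementById('skip-forward').addEventListener('click', () => {
  video.currentTime = Math.min(video.currentTime + 10, video.duration);
});
document.getElementById('skip-backward').addEventListener('click', () => {
  video.currentTime = Math.max(video.currentTime - 10, 0);
});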

poolodhi avatar Sep 28 '20 06:09 poolodhi

Issue #3187 seems to be a duplicate. We can refer to it for another example of repro.

joeyparrish avatar Mar 02 '21 21:03 joeyparrish

for info: I'm literally on the verge of using 3.0.8 in production. My config is:

    var config = {
      abr: {
        enabled:false,
      },
      streaming:{
        smallGapLimit:0,
      },
      drm: {
        servers: {
          'com.widevine.alpha': '*** my server ***',
        }
      }
    };

Apart from that it's a pretty vanilla usage pattern. I believe the 'smallGapLimit:0' avoids the behaviour at the expense of other functionality.

I've not seen any serious stalls for the moment - but I have not run additional diagnostics to check or seriously looked at network traffic. As we test, I'll ask people to report seeks which do not reach the requested position. br, simon

btsimonh avatar Mar 03 '21 10:03 btsimonh