Fix bandwidth sampling for small transfers and fast connections
This PR will...
Fix bandwidth sampling weight for small transfers and/or fast connections.
Why is this Pull Request needed?
The current logic is broken for fast transfers, whether they are fast because of small transfer sizes or because of a fast connection. See #3563 for details.
Are there any points in the code the reviewer needs to double check?
This changes the exact meaning of the abrEwma<X> options, which could be considered a breaking change.
I have also had to significantly lower the abrEwmaFastLive default to accommodate LL-HLS level downswitches. With the old value it took too long to discover a newly lowered bitrate, which could cause multiple successive emergency level switches and stalls. Note that the new value makes a temporary connection issue more likely to trigger a temporary downswitch. It might make sense to use this low value only for LL-HLS content, but that is outside the scope of this patch.
Resolves issues:
Partly fixes #3563. With this PR, the ABR algorithm is more likely to switch up from a low level (still only when there is sufficient bandwidth). Low-latency content on high-latency links is still unable to measure a suitable bitrate.
Checklist
- [x] changes have been done against master branch, and PR does not conflict
- [x] new unit / functional tests have been added (whenever applicable)
- [x] API or design changes are documented in API.md
I can't accept it based on how it would impact all streams.
That is the point of this PR: to fix the most egregious issue in #3563.
Did you see the detailed note in the commit that gives an example of just how broken the current estimator is?
With 5s segments of size 100,000: Start 10x 5 sec transfers => 160Kbps (reference)
Rate change (old): 1x 0.5 sec transfer (1.6Mbps) => 216Kbps vs 1x 0.05 sec transfer (16Mbps) => 222Kbps (10x faster is only 4% more!!)
Rate change (new): 1x 0.5 sec transfer (1.6Mbps) => 627Kbps vs 1x 0.05 sec transfer (16Mbps) => 5.3Mbps
I.e. with the old logic, a sample showing a 10x bandwidth increase raises the estimate by just 1.35x over a 5 second interval, while a 100x bandwidth increase only raises it by 1.39x!
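To make the saturation effect concrete, here is a minimal sketch of a duration-weighted EWMA. It assumes the old estimator weights each sample by its transfer duration in seconds and uses a 9 second half-life; the class and function names are hypothetical and this is not the hls.js source. Under those assumptions the two results come out close to the 216Kbps and 222Kbps figures quoted above.

```typescript
// Sketch only: a duration-weighted EWMA, assumed to approximate the old behaviour.
class DurationWeightedEwma {
  private readonly alpha: number;
  private estimate = 0;
  private totalWeight = 0;

  constructor(halfLifeSeconds: number) {
    // Per-second decay factor: a sample's influence halves after halfLifeSeconds.
    this.alpha = Math.exp(Math.log(0.5) / halfLifeSeconds);
  }

  sample(durationSeconds: number, bitsPerSecond: number): void {
    // Short transfers get a tiny weight, no matter how fast they were.
    const decay = Math.pow(this.alpha, durationSeconds);
    this.estimate = bitsPerSecond * (1 - decay) + decay * this.estimate;
    this.totalWeight += durationSeconds;
  }

  getEstimate(): number {
    // Correct the bias from starting the average at zero.
    const zeroFactor = 1 - Math.pow(this.alpha, this.totalWeight);
    return zeroFactor > 0 ? this.estimate / zeroFactor : this.estimate;
  }
}

// Ten 5 second transfers of 100,000 bytes, then one faster transfer of the same size.
function estimateAfterBurst(burstSeconds: number): number {
  const bytes = 100000;
  const ewma = new DurationWeightedEwma(9); // assumed half-life in seconds
  for (let i = 0; i < 10; i++) {
    ewma.sample(5, (8 * bytes) / 5);
  }
  ewma.sample(burstSeconds, (8 * bytes) / burstSeconds);
  return ewma.getEstimate();
}

const slowBurst = estimateAfterBurst(0.5); // one 1.6Mbps sample
const fastBurst = estimateAfterBurst(0.05); // one 16Mbps sample
```

The faster the transfer, the smaller its weight, so the two effects nearly cancel and `fastBurst` ends up only a few percent above `slowBurst`.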
An estimation rework is essential to ever get LL-HLS to work with ABR switching. This PR fixes the fundamental issue, and high-bandwidth estimation issues with the current implementation.
I really hope you will prioritise a fix before 1.0.0.
Note that the new default abrEwmaFastLive value is not essential to the fix, and could be omitted from the PR. Hls.js will need a mechanism to lower it for smooth near-edge LL-HLS ABR playback, though. Maybe the abrEwmaFastLive value could be capped to 1/2 the current time (in seconds) to buffer exhaustion?
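The capping idea could look roughly like the sketch below. The function name, the `minHalfLife` floor, and the use of buffered seconds divided by playback rate as "time to buffer exhaustion" are all assumptions made for illustration; none of them exist in Hls.js.

```typescript
// Sketch only: cap the fast half-life at half the time left before the buffer runs dry.
function cappedFastHalfLife(
  configuredHalfLife: number, // e.g. the configured abrEwmaFastLive value
  bufferAheadSeconds: number, // media buffered ahead of the playhead
  playbackRate: number = 1,
  minHalfLife: number = 0.1 // hypothetical floor so the half-life never reaches zero
): number {
  const rate = Math.max(playbackRate, 0.01); // guard against a paused or zero rate
  const secondsToExhaustion = bufferAheadSeconds / rate;
  const capped = Math.min(configuredHalfLife, secondsToExhaustion / 2);
  return Math.max(capped, minHalfLife);
}
```

With a configured value of 3 and 2 seconds of buffer this yields 1, while a comfortable 10 second buffer leaves the configured value untouched.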
I had another look at the estimation, and found that it can be simplified, and work better, if the weight is always a fixed value for each sample.
So my initial patch tried to use the fragment / part duration for the weight, which makes some sense, and certainly works a lot better than the current logic. However, it meant that the halfLife needed to be quite different for LL-HLS vs normal content.
I came to realise that there are essentially 2 modes when estimating, part loading vs. fragment loading, and both need to react to bandwidth changes on a timescale measured in samples rather than seconds. I.e. when close to the live edge, both modes have roughly 2 parts/fragments' worth of time to react to a bandwidth change.
Based on this realisation, I changed the abrEwmaFast/Slow values to represent a number of samples rather than seconds. This means that the same value will work quite nicely for both part and fragment loading, and the implementation can be simplified. I converted from the current fast=3 & slow=9 values using a normalised 6 second fragment duration. Besides improving the estimation responsiveness of part loading, I expect it will also work better for fragment loading on playlists with unusually high or low fragment durations, without any tweaking of abrEwmaFast/Slow.
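A minimal sketch of what a sample-based average could look like, assuming every sample carries a fixed weight of 1 and the half-life is expressed in samples. The class name and the half-life used in the example are hypothetical; this is not the code in the patch.

```typescript
// Sketch only: EWMA where each sample has the same weight, regardless of transfer duration.
class SampleCountEwma {
  private readonly alpha: number;
  private estimate = 0;
  private samples = 0;

  constructor(halfLifeSamples: number) {
    // Per-sample decay factor: a sample's influence halves after halfLifeSamples samples.
    this.alpha = Math.pow(0.5, 1 / halfLifeSamples);
  }

  sample(bitsPerSecond: number): void {
    this.estimate = this.alpha * this.estimate + (1 - this.alpha) * bitsPerSecond;
    this.samples += 1;
  }

  getEstimate(): number {
    // Correct the bias from starting the average at zero.
    const zeroFactor = 1 - Math.pow(this.alpha, this.samples);
    return zeroFactor > 0 ? this.estimate / zeroFactor : this.estimate;
  }
}

// Ten samples at 160Kbps followed by one at 1.6Mbps, with an assumed half-life of 1 sample.
const perSample = new SampleCountEwma(1);
for (let i = 0; i < 10; i++) {
  perSample.sample(160000);
}
perSample.sample(1600000);
const perSampleEstimate = perSample.getEstimate();
```

Because the transfer duration never enters the weight, a part and a full fragment move the estimate by the same amount, which is why one pair of values can serve both loading modes. Here the single fast sample pulls the estimate roughly halfway towards 1.6Mbps.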
As part of this revision I also removed the Live/VOD distinction of the config values. I did this since the default values were already the same, and because adjusting the values based on the playlist type is a very simplistic approach. It makes much more sense to adjust the values based on the current buffer level. Both live and VOD playback can have low & high buffer levels due to bandwidth conditions. The current values are tuned to a low buffer level, so it would be prudent to detect a high buffer level and raise the values dynamically, to avoid quality dips from a temporary bandwidth burp. This is probably outside of the scope of the current patch, though.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
This issue has been automatically closed because it has not had recent activity. If this issue is still valid, please ping a maintainer and ask them to label it accordingly.
This PR has been replaced by #4825