
Feature Request: HLS-to-Swarm adapter

Open chrishobcroft opened this issue 2 years ago • 16 comments

Summary

HTTP Live Streaming (also known as HLS) is an HTTP-based adaptive bitrate streaming communications protocol developed by Apple Inc. and released in 2009.

This feature request is for a mechanism to automatically store content published from an HLS stream into Swarm.

Motivation

Video is the highest-bandwidth means of digital communication.

Furthermore, live streaming video is becoming increasingly accessible to everyone as a way to share content, such as these high-quality technical talks from EthPrague.

Most of the value being created live is being captured and mined by existing closed-source monopolies.

A feature like this could provide an easy way for publishers of content via HLS livestreams to store their content more sustainably, while mitigating their de-platforming risk.

Implementation

There appear to me to be two options:

  1. A new daemon, which exists outside of Swarm, requests content from the .m3u8 HLS endpoint and pushes it into Bee. This can be written in any language.

  2. A feature in Bee itself, which requests content from the .m3u8 HLS endpoint. This must be written in Go.

Recommend using MistServer to provide the .m3u8 endpoint to develop and test against.

Recommend starting with a very low bitrate, frame size, and frame rate, then turning them up to see where it breaks.
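For a sense of scale, option 1 could start as a small polling loop. This Python sketch assumes a Bee node's /bzz upload endpoint and a postage batch ID (the URLs and batch ID below are placeholder values); treat it as an illustration, not a finished daemon:

```python
import time
import urllib.request
from urllib.parse import urljoin

def parse_playlist(m3u8_text, base_url):
    """Return absolute URLs of the .ts segments listed in an HLS playlist."""
    return [urljoin(base_url, line.strip())
            for line in m3u8_text.splitlines()
            if line.strip() and not line.startswith("#")]

def push_segment(data, bee_url, batch_id):
    """Upload one segment via Bee's /bzz endpoint; returns the response body."""
    req = urllib.request.Request(
        f"{bee_url}/bzz",
        data=data,
        headers={"swarm-postage-batch-id": batch_id,
                 "content-type": "video/mp2t"},
        method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def run(playlist_url, bee_url, batch_id, poll_interval=1.0):
    """Poll the playlist and push every segment we haven't seen yet."""
    seen = set()
    while True:
        with urllib.request.urlopen(playlist_url) as resp:
            playlist = resp.read().decode()
        for seg_url in parse_playlist(playlist, playlist_url):
            if seg_url not in seen:
                seen.add(seg_url)
                with urllib.request.urlopen(seg_url) as seg:
                    push_segment(seg.read(), bee_url, batch_id)
        time.sleep(poll_interval)

# e.g. run("http://localhost:8080/hls/stream/index.m3u8",
#          "http://localhost:1633", "<postage-batch-id>")
```

The daemon never needs to understand MPEG-TS; it only mirrors whatever files the playlist advertises.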

Drawbacks

It may be true that Swarm isn't able to receive a stream of content, however low the bitrate is.

chrishobcroft avatar Jun 23 '22 20:06 chrishobcroft

cf https://github.com/bee-resource/bee-node/blob/98320f4629c4caaef9f64ec7ff3f130b9a7a2f44/pkg/pushsync/pushsync.go

IxaBrjnko avatar Jun 23 '22 21:06 IxaBrjnko

I think the 1st option (daemon) is the easiest for interoperability. It also allows listening to be set up on a separate interface, and it only needs to integrate with a pusher.

IxaBrjnko avatar Jun 23 '22 21:06 IxaBrjnko

There are several choices to be made about how to break the chunk space down in relation to the content available from the .m3u8 stream, but for Livepeer the constraints are a 2-second .ts segment length and stable rendering of up to 1080p H264+AAC video at 60fps: https://livepeer.studio/docs/guides/start-live-streaming/support-matrix

IxaBrjnko avatar Jun 23 '22 21:06 IxaBrjnko

...at 1080p and 8-bit color, a 4096 KB chunk could potentially hold... 2 frames.
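For what it's worth, that back-of-the-envelope number checks out if the unit is 4096 KB and "8-bit color" means one byte per pixel; note that Swarm's native chunk is only 4096 bytes, so even one raw frame would span hundreds of native chunks:

```python
frame_bytes = 1920 * 1080          # one raw 1080p frame at 1 byte per pixel
blob_bytes = 4096 * 1024           # 4096 KB
print(blob_bytes // frame_bytes)   # frames per 4096 KB blob: 2
print(-(-frame_bytes // 4096))     # native 4096-byte chunks per raw frame: 507
```

Compression changes these numbers completely, of course, which is the point of the next comment.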

IxaBrjnko avatar Jun 23 '22 21:06 IxaBrjnko

...but most frames in most video are mostly the same from frame to frame.

IxaBrjnko avatar Jun 23 '22 21:06 IxaBrjnko

Further motivation

If such an adapter existed, it would result in the archiving of a series of n-second-long (e.g. n=2) video segments, which, when concatenated, form the full video. The motivation stems from the concept of a lo-fi "web app video editor", where a user is able to cut a video by merely rearranging or deleting the swarm hashes in the json (for example!)
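The "web app video editor" could then be nothing more than list surgery on that JSON; a sketch with placeholder hashes (these are not real Swarm references):

```python
import json

# Hypothetical archive index: one Swarm reference per 2-second segment.
index = ["hash-seg0", "hash-seg1", "hash-seg2", "hash-seg3"]

def cut(index, start, end):
    """Drop segments [start, end) from the edit."""
    return index[:start] + index[end:]

def reorder(index, order):
    """Play the remaining segments in the given sequence of positions."""
    return [index[i] for i in order]

# Cut out the second segment, then rearrange what is left.
edited = reorder(cut(index, 1, 2), [2, 0, 1])
print(json.dumps(edited))  # → ["hash-seg3", "hash-seg0", "hash-seg2"]
```

The edited JSON is itself tiny, so publishing a new "edit" costs almost nothing compared to re-uploading video data.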

Another implementation option

Another way to implement:

  1. Develop a way for MistServer to push into a Swarm endpoint. This may need to be developed in C.

chrishobcroft avatar Jun 23 '22 21:06 chrishobcroft

Merging multiple segments/chunks into a stream would need to interoperate with feeds: https://github.com/ethersphere/bee-docs/blob/master/docs/dapps-on-swarm/feeds.md, and bmt-js might also help with some of the overhead from pixels shared between frames. Pulling is a bit different from pushing...

IxaBrjnko avatar Jun 23 '22 21:06 IxaBrjnko

  1. Develop a way for MistServer to push into a Swarm endpoint. This may need to be developed in C.

I have read the manuals so far, but not the source code... https://github.com/DDVTECH/mistserver

IxaBrjnko avatar Jun 23 '22 21:06 IxaBrjnko

merging multiple segments/chunks into a stream...

Perhaps we need to be clear on terminology here. I don't believe that segments and chunks can be used interchangeably.

In the HLS world, an n-second-long segment of video is represented as a .ts file. This file would then (presumably) be divided into chunks for storage on Swarm.

Does that sound reasonable?
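To make the segment/chunk distinction concrete, the division can be sketched directly; this ignores the BMT hashing a real client performs and just shows the relationship (4096 bytes is Swarm's chunk payload size):

```python
CHUNK_SIZE = 4096  # Swarm's chunk payload size in bytes

def split_segment(data: bytes):
    """Divide one HLS .ts segment into Swarm-sized chunks."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

# A hypothetical 17920-byte low-bitrate segment becomes 5 chunks
# (4 full chunks plus one 1536-byte remainder):
chunks = split_segment(b"\x00" * 17920)
print(len(chunks))  # → 5
```

So one segment maps to many chunks, never the other way around.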

chrishobcroft avatar Jun 24 '22 07:06 chrishobcroft

I think this is a good idea to try; there were some discussions about this a long time ago, because the single-owner chunks (SOC) feature seems to be a good match for writing a stream of chunks for video content.

I am curious about the feasibility of this, because currently the Swarm network is not optimised for real-time delivery. However, I happened to participate in a hack project earlier (https://swapchat.bzz.link), which is a private chat between two parties using SOCs with polling, similar to how I imagine the video use-case could work. It uses the public gateway, which is rate limited; that sometimes causes problems when loading the page (so you may have to reload a few times), but once connected the messages arrive in ~3-5 seconds. Each message is transmitted as a single chunk. That may be good enough for live-streaming as well, with the added bonus of persisting the chunks so the stream can be replayed later.

As a test I would go with the first option: create a new daemon and try to push data onto Swarm with an increasing SOC index. Then it would be possible to measure how high the latency is with another custom client application that polls the highest index. It is important to mention that this client should connect to a different Bee node than the one where the upload happens, because Bee caches chunks and that may skew the results.

Also, the polling must have some added logic, for example trying to fetch the same index several times with some delay between attempts and cancelling the outstanding requests once one of them succeeds. This is necessary because the current Bee implementation returns 404 after a few seconds instead of continuing to look for a chunk. There were some discussions about how this behaviour could be changed; my proposal would be to add an endpoint to Bee that keeps trying to fetch a chunk by address until a timeout, to support use-cases like this.
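For illustration, that hedged-request polling could look like the following sketch; `fetch` stands in for an HTTP GET of one SOC index against a Bee node (the function name and defaults are made up):

```python
import concurrent.futures
import time

def fetch_with_hedged_retries(fetch, attempts=3, stagger=0.5):
    """Poll the same SOC index with staggered duplicate requests and
    return the first success. `fetch` is any callable that returns the
    chunk data or raises (e.g. on Bee's premature 404)."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=attempts) as pool:
        futures = []
        for _ in range(attempts):
            futures.append(pool.submit(fetch))
            time.sleep(stagger)  # give the newest request a head start
            for f in futures:
                if f.done() and f.exception() is None:
                    for other in futures:
                        other.cancel()  # best effort; running threads finish
                    return f.result()
    # All attempts launched; settle for the first straggler that succeeds.
    for f in concurrent.futures.as_completed(futures):
        if f.exception() is None:
            return f.result()
    raise LookupError("chunk not retrievable after all attempts")
```

The cancellation is best-effort only (Python threads cannot be killed); a native Bee endpoint that long-polls until timeout would make all of this client-side machinery unnecessary.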

agazso avatar Jun 24 '22 09:06 agazso

Hey all! I'm the lead dev for MistServer, and Chris asked me to chime in a bit here. I'm not very familiar with Swarm at all - so please do correct me if I'm ignorant on anything or making wrong assumptions. 😇

First of all, I'm listing my assumptions here:

  • Swarm is intended here to be the archival method of a (live) stream, not the real-time distribution method.
  • The HTTP retrieval endpoints have support for range requests (do they? I hope so...)
  • Native playback in as many applications/browsers as possible is intended, preferably using existing standards

If those assumptions are all valid, I propose the following solution:

MistServer can be told to pull an HLS endpoint (or almost any other kind of stream endpoint, really) and turn it into a local "stream" on the system. This abstracts away the nitty-gritty details of the ingest format, so from this point onward all HLS-related specifics no longer matter (and any other supported input format would work identically). MistServer can then automatically push streams on the system elsewhere using various methods, including writing a file to disk.

A "quick and dirty" method would be to simply record the stream to a local file, use a RECORDING_END trigger to detect the completion of the write, and then use the usual file upload method to store the file and delete the local copy. This requires no changes to MistServer at all, only e.g. a small bash script to handle the upload + delete (to be run by MistServer automatically on recording completion).

A nicer method would be to use this endpoint: https://docs.ethswarm.org/api/#tag/Chunk/paths/~1chunks~1stream/get (since MistServer already has a lot of native support for websockets and chunking), and stream an MKV/WebM format recording into Swarm as it's being created. This would take somebody familiar with Mist's codebase maybe 5-10 hours to complete, so it's not a crazy amount of work at all.

As an in-between, a simple application could be written to take data over stdin and upload it in the same way. MistServer supports piping an ongoing recording into stdin of any arbitrary shell command.
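The "in-between" stdin approach is tiny in practice. A hedged Python sketch, where the `upload` callable is a placeholder for a real Swarm client (e.g. one speaking to the chunk-stream endpoint linked above):

```python
BLOCK = 64 * 1024  # read size per iteration; tune to taste

def relay(stream, upload):
    """Read an ongoing recording from `stream` and hand each block to
    `upload` (a stand-in for a streaming POST to a Bee endpoint)."""
    total = 0
    while True:
        block = stream.read(BLOCK)
        if not block:  # MistServer closed the pipe: recording finished
            break
        upload(block)
        total += len(block)
    return total

# MistServer would invoke this script with the recording piped to stdin:
#   import sys
#   relay(sys.stdin.buffer, upload=swarm_uploader)  # swarm_uploader: your client
```

Because the relay is format-agnostic, the same script works whether Mist pipes MKV, WebM, or anything else.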

Regardless of which of the above methods is used, you'd end up with a WebM/MKV format recording that, when requested over HTTP, would play practically anywhere (most non-Apple browsers, pretty much every dedicated media player application).

I'd be happy to assist or help explain where needed - might even write the "nicer method" myself, but I'm not sure exactly when I'd be able to make the time free to actually do so. 😅

Thulinma avatar Jun 24 '22 11:06 Thulinma

I am curious about the feasibility of this, because currently the Swarm network is not optimised for real-time delivery. However, I happened to participate in a hack project earlier (https://swapchat.bzz.link), which is a private chat between two parties using SOCs with polling, similar to how I imagine the video use-case could work. It uses the public gateway, which is rate limited; that sometimes causes problems when loading the page (so you may have to reload a few times), but once connected the messages arrive in ~3-5 seconds. Each message is transmitted as a single chunk. That may be good enough for live-streaming as well, with the added bonus of persisting the chunks so the stream can be replayed later.

In my work with a 55-million-reference dataset (some references are multiple chunks), I've seen individual chunks take 5-10 seconds to retrieve. The internal timeout on the /bytes API endpoint is currently 10 seconds, as seen in the following code link. The public gateway is really useless for testing anything that involves moving large amounts of data or expecting a quick response; running your own local node is much better. There'll be a big difference between single chunks for text messages and the size required for live-streaming video. The trickiest part of response-time testing is ensuring that the data is coming from the swarm (first time) and not from the node's local cache (every subsequent retrieval). I use verbosity 5 logs and watch the output for "retrieval" messages that indicate that the swarm is being accessed.

https://github.com/ethersphere/bee/blob/0bf1fd7488d973a0752f225a837daafb3c6108fd/pkg/retrieval/retrieval.go#L100

It is important to mention that this client should connect to a different Bee node than the one where the upload happens, because Bee caches chunks and that may skew the results.

And for each subsequent test, you either need to db nuke the retrieving Bee node or have a bunch of different nodes to sequence through. I have not found any way, short of a db nuke, to remove chunks from the local node cache after the first retrieval.

Also, the polling must have some added logic, for example trying to fetch the same index several times with some delay between attempts and cancelling the outstanding requests once one of them succeeds. This is necessary because the current Bee implementation returns 404 after a few seconds instead of continuing to look for a chunk. There were some discussions about how this behaviour could be changed; my proposal would be to add an endpoint to Bee that keeps trying to fetch a chunk by address until a timeout, to support use-cases like this.

As mentioned above, the current timeout is 10 seconds, but it is not uncommon for a failure to happen sooner than that. Also rather surprising to me: if you hit the /bytes retrieval API repeatedly (/bytes is effectively what /bzz uses under the covers, but I use the former exclusively in my testing), it will often succeed after a bunch of retries. By "a bunch", I'm currently using 30 retries on each retrieval. Many work the first time, but I've seen them take up to 29 or 30 retries, and some chunks still fail even after that.

I recently updated my retrieval/pinner test to send a PUT /stewardship to my original uploading node on the first /bytes failure, and most of my chunks now succeed on the first retry after the /stewardship has "repaired" the swarm. This is all very explicit client-side JS code, but it works really well for maintaining full retrievability of a dataset from the swarm.
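Reduced to control flow, that retry-plus-stewardship loop looks roughly like this (the HTTP calls are abstracted into callables so the sketch stays transport-agnostic; the endpoint names in the comments follow the description above):

```python
def retrieve_with_repair(fetch, repair, retries=30):
    """Retry `fetch` (a GET /bytes/{reference}) up to `retries` times.
    After the first failure, call `repair` once (a PUT /stewardship to
    the uploading node) so later attempts hit a re-pushed chunk."""
    repaired = False
    for _ in range(retries):
        try:
            return fetch()
        except IOError:
            if not repaired:
                repair()
                repaired = True
    raise LookupError("still unretrievable after %d tries" % retries)
```

Keeping the repair one-shot matters: re-issuing /stewardship on every failed attempt would multiply load on the uploading node without improving the odds much.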

BTW, I don't yet know how long it takes to traverse the full 55-million-reference dataset from the swarm, but it will definitely be measured in days, not hours or minutes. It's the dataset behind:

https://bah5acgza7x6rod3tsu54eywzg3j2kmu3pb4yam25ybkhamv3fjrdt27jlj3a.bzz.link/

Which is mainnet swarm collection (manifest) reference:

f09da1184cc9ef6af3b228f72c6ff965fcfee58b47d65d52ef6cf4e5347c766e

I'll be watching this HLS/video effort as it progresses!

ldeffenb avatar Jun 24 '22 12:06 ldeffenb

Perhaps this isn't necessary, but I want to impress on you all the opportunity to try this with a very low data rate HLS stream.

Like, forget 1080p and 60fps for now; let's start with 72p (128x72 pixels), the lowest video/audio bitrates, and a minimal frame rate, e.g. 4fps.
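At those rates the per-segment load on Swarm is tiny. Taking K as 1000 bits and assuming 2-second segments (both assumptions, matching the numbers discussed in this thread):

```python
video_kbps, audio_kbps = 50, 20    # OBS's low-end rtmp settings, per below
segment_seconds = 2                # typical HLS .ts segment length
chunk_bytes = 4096                 # Swarm chunk payload size

segment_bytes = (video_kbps + audio_kbps) * 1000 // 8 * segment_seconds
chunks_per_segment = -(-segment_bytes // chunk_bytes)  # ceiling division
print(segment_bytes, chunks_per_segment)  # → 17500 5
```

So each 2-second segment is ~17.5 KB, or about 5 chunks: a very gentle starting workload.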


@Thulinma what is the lowest data rate of stream feasible using MistServer? OBS can publish to an rtmp endpoint with as little as 50Kbps video and 20Kbps audio, but I wonder whether MistServer could serve up even lighter-weight HLS streams? Y'know, we don't want to hit Swarm with more than it can handle ;)

chrishobcroft avatar Jun 24 '22 16:06 chrishobcroft

Sharing a zip file with some 2-second .ts video segments

chrishobcroft avatar Jun 24 '22 16:06 chrishobcroft

merging multiple segments/chunks into a stream...

Perhaps we need to be clear on terminology here. I don't believe that segments and chunks can be used interchangeably.

In the HLS world, an n-second-long segment of video is represented as a .ts file. This file would then (presumably) be divided into chunks for storage on Swarm.

Does that sound reasonable?

No, the last part isn't really what I am proposing, although I do agree on the terms. In any case, either segments, or chunks of segments (or segments of chunks, though at realistic display standards those would be no more than a couple of frames), must be written into a feed so they can be read back out of Swarm as a batch of chunks.

IxaBrjnko avatar Jun 24 '22 16:06 IxaBrjnko

@Thulinma what is the lowest data rate of stream feasible using MistServer? OBS can publish to an rtmp endpoint with as little as 50Kbps video and 20Kbps audio, but I wonder whether MistServer could serve up even lighter-weight HLS streams? Y'know, we don't want to hit Swarm with more than it can handle ;)

There is no practical lower limit, but things will start to look pretty bad if you go too low. Would still work just fine, though.

Thulinma avatar Jun 24 '22 22:06 Thulinma

We can reopen the issue when further discussion is needed.

istae avatar May 24 '23 11:05 istae