
IPFS to store videos

Open alxlg opened this issue 8 years ago • 124 comments

I think that what limits PeerTube adoption is that instances are perfect for personal/organization use but not to build a free service like YouTube where everyone can upload videos without limits. The issue is that storage has a cost and videos make the necessary storage grow quickly.

IPFS (InterPlanetary File System) could be used to solve the storage issue, because every user can store the files themselves, but it doesn't have a way to browse and interact with them. PeerTube, on the other hand, has an awesome UI that can be used by everyone.

Would it be possible to combine PeerTube and IPFS? Ideally the instance administrator would limit the classic upload for each user but let users upload videos by specifying an IPFS address. I guess that when a second user browses a PeerTube instance and wants to watch a video hosted on IPFS, PeerTube provides it by reading from IPFS rather than from its local storage. PeerTube instances would cache IPFS content like other IPFS users do, and admins would monitor the impact of the IPFS cache on their storage. If a PeerTube user wants to be sure their video stays available, they just have to keep it on IPFS with their own machine. This could have another advantage: if the PeerTube instance they use ever becomes unavailable, its users won't need to re-upload their videos to other PeerTube instances if they are on IPFS: they would just "upload" the IPFS addresses.

I would be grateful to anyone who can confirm or refute my assumptions.

alxlg avatar Apr 11 '18 11:04 alxlg

Ideally the instance administrator would limit the classic upload for each user but let users upload videos by specifying an IPFS address. I guess when a user […] wants to watch a video hosted on IPFS, PeerTube provides it by reading from IPFS and not from its local storage.

@alxlg indeed, that's more or less how I envisioned the potential use of IPFS. But then there's the fact that an IPFS endpoint is not a webseed¹ and doesn't hold versions of different quality. In other words, IPFS would only be a second-class citizen feature-wise.

¹: let me expand a bit on that issue. The fact is that we use WebTorrent (BitTorrent/WebRTC) on the client side to watch videos. It provides a handy pool of content seeders and direct browser connections. Watching a video via IPFS would mean entirely replacing that component with an IPFS client in the browser. So it's not just a matter of a different storage/upload mechanism.

If you have any ideas as to how to solve these problems, I'm all ears :)

P.S.: we also haven't heard much about how IPFS performs when it comes to streaming videos.

rigelk avatar Apr 11 '18 12:04 rigelk

@rigelk thanks for your reply!

I had not thought of the different quality versions of videos. Since IPFS is really low in the stack, the only solution I can think of is storing a different IPFS file for each version. The user should be able to specify the IPFS address for each quality version they want to maintain... This doesn't seem user-friendly, but with a desktop client that automatically manages versions on IPFS it could gain adoption... Ideally the desktop client could use some API to upload the video to PeerTube by specifying several IPFS addresses. Users of the desktop client would just pick a video from their HDD, and the client would generate the different versions, upload them to IPFS, and send the addresses to a PeerTube instance.

It seems like a big amount of work, but it looks promising to me, and the idea could attract many contributors.

alxlg avatar Apr 11 '18 13:04 alxlg

@alxlg we're not even close to writing a desktop client. This is a non-option given our resources.

I was considering leaving the video uploaded without transcoding, thus leaving a single quality available. It's always better than no video at all.


But now that I come to think of it, about ¹: do we even have to replace the WebTorrent client for IPFS videos? If we could manage to mark IPFS endpoints as WebSeeds, we could just use them under the hood by making the WebTorrent client aware of them.
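For illustration, marking IPFS endpoints as webseeds could be as simple as deriving HTTP URLs from the video's CID, since a BEP 19 webseed is just an HTTP URL serving the exact file bytes, which is what an IPFS path gateway provides. This is only a sketch: the gateway hosts and the CID are illustrative, and feeding the URLs to WebTorrent would go through its `torrent.addWebSeed(url)` API.

```javascript
// Hypothetical list of public IPFS gateways to use as webseeds.
const GATEWAYS = ['https://ipfs.io', 'https://dweb.link'];

// Build one webseed URL per gateway for a given file CID.
function ipfsWebSeeds(cid, gateways = GATEWAYS) {
  return gateways.map((g) => `${g}/ipfs/${cid}`);
}

// These URLs could then be passed to WebTorrent, e.g. with
// torrent.addWebSeed(url), or put in the torrent's `urlList`
// when the torrent is created.
const seeds = ipfsWebSeeds('QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR');
```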

rigelk avatar Apr 11 '18 13:04 rigelk

@rigelk in fact I did not intend to replace the video player. I thought that the server could run both PeerTube and an IPFS node, and the PeerTube instance would see the files cached by IPFS like local files... I hope it makes sense now...

This feature doesn't depend on a desktop client; a desktop client would just help normal PeerTube users automatically keep their videos stored locally.

I would be happy to store some videos with IPFS on my HDD without running a PeerTube instance, which needs much more maintenance.

I think the change would mostly be in the PeerTube UI: providing a way to upload a video via an IPFS address instead of uploading the entire file to the PeerTube instance. Of course the server admin would have to configure it to run IPFS...

alxlg avatar Apr 11 '18 16:04 alxlg

Wouldn't a WebRTC torrent app running on the user's PC do the same thing if they leave it seeding the video? This would also allow users to create "seedboxes" using RSS auto-downloading torrent apps. A simple/KISS way of "distributing" the video hosting.

https://github.com/Openmedianetwork/visionOntv/wiki/Seedbox

Openmedianetwork avatar Apr 18 '18 07:04 Openmedianetwork

@Openmedianetwork good point! I think your proposal is easier to implement, but using IPFS too could have some advantages. For example, I'm pretty sure that if an instance of PeerTube is no longer available, a user can re-upload their IPFS videos to another instance just by sharing the IPFS addresses, which is much better than re-uploading video files! Do you think this could be achieved with torrents/WebTorrent too? Would changes to PeerTube be needed? Maybe uploading a *.torrent file instead of a video file?

alxlg avatar Apr 18 '18 15:04 alxlg

@alxlg using an IPFS address or a *.torrent file yields the same import capabilities. See #102. The only advantage of IPFS I see is that there are pinning brokers. (For BitTorrent too? I didn't check)

rigelk avatar Apr 18 '18 15:04 rigelk

issues with testing webrtc http://hamishcampbell.com/index.php/2018/05/18/testing-peertube-webrtc-seedboxes/

Openmedianetwork avatar May 25 '18 22:05 Openmedianetwork

@Openmedianetwork this has nothing to do with IPFS. Please find a related issue and detail your problem there, not in a blog post.

rigelk avatar May 25 '18 23:05 rigelk

Sorry, this was an update for @alxlg: my suggestion of WebRTC torrents as an alternative for seeding does not appear to work after actual testing. Will start a new thread after further tests.

Openmedianetwork avatar May 26 '18 07:05 Openmedianetwork

Closing this issue since we have no real use cases for IPFS for now.

Chocobozzz avatar May 29 '18 12:05 Chocobozzz

@rigelk @Chocobozzz IPFS can serve as a backup for local storage and dedicated seeding pools, as it effectively transfers each newly added file to an entire pre-existing network of 300+ peers from the get-go. In such a setup, PeerTube might not necessarily even have to double as a WebRTC-based IPFS node but could simply run alongside a regular one (which in itself can be optional), first to ensure the files initially propagate through the network, and second to provide an optional method for retrieving them through a local gateway. However, simply linking to an IPFS address should suffice if it were possible to configure a PeerTube instance to use external public gateways for retrieval.

In a case where a PeerTube instance goes down with no pre-existing seeding pools in place, as long as the videos are still present on IPFS, it should be possible to retrieve them by simply following each video's address (which presumably was shared beforehand with other federated instances). This way each video would remain accessible and could therefore later be conventionally re-seeded via a different instance.

As a by-product, if it'll be possible to authenticate each user's identity, perhaps it might also be possible to use this method for transferring channels between different PeerTube instances.

NightA avatar Jun 27 '18 22:06 NightA

@Chocobozzz shouldn't this be reopened? It seems what @NightA mentioned was a pretty good idea.

poperigby avatar Sep 26 '19 18:09 poperigby

I'm actually interested in implementing this, but I think a roadmap should be discussed.

What I propose:

Phase 1: Server uploads to IPFS and stores hashes of videos

This phase has the potential of requiring double the disk space, since the files will be stored on disk normally, then uploaded to IPFS and pinned. To prevent that, ipfs-fuse could be used. That would allow mounting IPFS to a directory and designating all videos to be stored there, i.e. storing them in one place: IPFS.

I assume there's a JSON file or a table in the DB with the videos, where a field or column for the hashes of the video files can be added.

Phase 2: Import from IPFS

In this phase, the user will have the option to provide a hash that the server can download and process. Maybe it's possible to check the filetype before downloading it in order to save bandwidth, I dunno.


If phase 1 is done intelligently, hashes of videos already in the db can be rejected since that would create an unnecessary duplicate. But if another server uses the same hash with different properties (title, description) that might not be good. Up for discussion
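The rejection rule described above boils down to a set lookup keyed on the IPFS hash. A minimal sketch, assuming the instance keeps a set of known CIDs (the names `knownCids` and `canImportCid` are hypothetical, not PeerTube code):

```javascript
// CIDs of videos already stored by this instance (hypothetical data).
const knownCids = new Set(['QmExistingVideoCid']);

// Accept an import only if its CID is not already present.
// Same CID means byte-identical content, so a second copy adds nothing;
// how to handle differing metadata (title, description) is the open
// question raised above.
function canImportCid(cid) {
  return !knownCids.has(cid);
}

const duplicate = canImportCid('QmExistingVideoCid'); // false: already stored
const fresh = canImportCid('QmNewVideoCid');          // true: new content
```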

Phase 3: Syncing with other instances and downloading hashes

This is separated from Phase 2 only if the code is in a separate area. Importing from IPFS might be a different procedure from receiving a video file from another instance.
Since I don't know what data is sent over ActivityPub, I assume it's either the torrent or a link for a server-to-server API call, which includes information about the video, e.g. title, description, resolutions and (of course) hashes.

Phase 4: (wishful thinking) ipfs:// links for videos

Users running their own IPFS nodes with IPFS companion could then stream using IPFS.
I haven't actually done a lot of research into this, so I don't know if it's possible. Maybe plugin for videojs would be necessary - I dunno.

End goal

Instances running IPFS nodes and using that to download and pin hashes, which would allow :

  • greater resilience to takedowns or simple storage failure
  • additional data sources since users could host the data too
  • possibly less bandwidth consumption if Phase 4 actually is possible and is done

LoveIsGrief avatar Sep 26 '19 19:09 LoveIsGrief

Server uploads to IPFS and stores hashes of videos

This phase has the potential of requiring double the disk space since the files will be stored on disk normally, then uploaded to IPFS and pinned. In order to prevent that, ipfs-fuse could be used. That will allow mounting ipfs to a directory and designating all videos to be stored there / storing them in one place : IPFS.

Where is it uploaded exactly? A third party (a pinning service?)?

ipfs-fuse requires the Go IPFS runtime alongside, so this will complicate the deployment.

Syncing with other instances and downloading hashes

This is trivially done by adding another Link object in the ActivityPub Video.url field.
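For illustration, the extra Link object in the Video's `url` array might look like this (a sketch; the `ipfs://` href form and the `mediaType` value for an IPFS link are assumptions, not an established convention):

```json
{
  "type": "Video",
  "url": [
    {
      "type": "Link",
      "mediaType": "video/mp4",
      "href": "https://example.com/static/webseed/my-video-720.mp4"
    },
    {
      "type": "Link",
      "mediaType": "video/mp4",
      "href": "ipfs://QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR"
    }
  ]
}
```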

Users running their own IPFS nodes with IPFS companion could then stream using IPFS.

What bothers me is that users are expected to have this extension and a running IPFS Go runtime.

I haven't actually done a lot of research into this, so I don't know if it's possible. Maybe plugin for videojs would be necessary - I dunno.

I haven't found any videojs plugin for ipfs, or its companion extension.

Instances running IPFS nodes and using that to download and pin hashes, which would allow :

* greater resilience to takedowns or simple storage failure

* additional data sources since users could host the data too

* possibly less bandwidth consumption if Phase 4 actually is possible and is done
  • resilience can be achieved in IMHO simpler ways. Right now what makes a video's origin instance disappearance a problem is that the BitTorrent tracker is the instance. Sharing the video over DHT alongside would already solve the problem, without changing our infrastructure.
  • users can already host the data, with a webtorrent-compatible torrent client.
  • less bandwidth consumption is already achieved with WebTorrent: people watching share their bandwidth (and not just those with an extension)
  • less bandwidth consumption is already achieved with WebSeeds/replication, with less impact to the buffering speed.

rigelk avatar Sep 27 '19 13:09 rigelk

Where is it uploaded exactly? A third party (a pinning service?)?

It's pinned locally and uploaded when somebody else requests it over IPFS. If someone does so over a third party, then they'll have it, but not pinned.

ipfs-fuse requires the Go IPFS runtime alongside, so this will complicate the deployment.

That depends. It's very possible to do that in stages too:

  • stage 1: tell admin that IPFS has to be installed
  • stage 2: provide a config interface for the admin to target an IPFS node of choice (IP:port of the IPFS HTTP API used by ipfs-fuse)
  • stage 3: provide option to install IPFS for the admin

What bothers me is that users are expected to have this extension and a running IPFS Go runtime.

This is not a proposal to force users to use IPFS, merely to give users the option. Right now they have the opportunity to use HTTP or WebTorrent. This would merely be another one.

I haven't found any videojs plugin for ifps, or its companion extension.

Yes, that would have to be developed if necessary. The companion provides the useful feature of redirecting IPFS URLs to a local gateway:

Requests for IPFS-like paths (/ipfs/{cid} or /ipns/{peerid_or_host-with-dnslink}) are detected on any website.
If the tested path is a valid IPFS address, it gets redirected to and loaded from a local gateway, e.g.:

https://ipfs.io/ipfs/QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR
http://127.0.0.1:8080/ipfs/QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR

I assume we use HLS for streaming, so serving a different .m3u8 playlist with /ipfs links would be the only work required.
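The playlist rewrite suggested above could be a plain text transform over the .m3u8: leave the #EXT tag lines alone and swap each segment URI for a gateway /ipfs/ path. A sketch, where `cidBySegment` is a hypothetical lookup from segment file name to its CID:

```javascript
// Rewrite segment URIs in an HLS playlist to IPFS gateway paths.
// Lines starting with '#' are HLS tags/comments and pass through untouched;
// segments without a known CID are also left as-is.
function rewritePlaylist(m3u8, cidBySegment, gateway = 'http://127.0.0.1:8080') {
  return m3u8
    .split('\n')
    .map((line) => {
      if (line.startsWith('#') || line.trim() === '') return line;
      const cid = cidBySegment[line.trim()];
      return cid ? `${gateway}/ipfs/${cid}` : line;
    })
    .join('\n');
}

const playlist = '#EXTM3U\n#EXTINF:4.0,\nseg0.ts';
const out = rewritePlaylist(playlist, { 'seg0.ts': 'QmSeg0' });
```

With IPFS Companion installed, the rewritten /ipfs/ URLs would then be picked up and redirected to the user's local gateway automatically.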


Your final points are valid, but I don't see the harm in providing an additional option. In case you missed it, I am willing to implement this, so your workload would be reduced to code-reviews and handling future bugs (since no code is perfect).

Of course, if you are firmly against having it in the original code base, I'll investigate if a plugin can be written and if not, simply fork it.

LoveIsGrief avatar Sep 27 '19 14:09 LoveIsGrief

This is not a proposal to force users to use IPFS, merely to give users the option. Right now they have the opportunity to use HTTP or WebTorrent. This would merely be another one.

Your final points are valid, but I don't see the harm in providing an additional option.

:+1:

In case you missed it, I am willing to implement this, so your workload would be reduced to code-reviews and handling future bugs (since no code is perfect).

I am not sure we could handle the future bugs part, especially when dealing with technologies we don't use regularly. And since merging code means taking responsibility for its maintenance…

Of course, if you are firmly against having it in the original code base, I'll investigate if a plugin can be written and if not, simply fork it.

I would suggest waiting for @Chocobozzz to answer about that - regarding plugins, not everything can be changed via their API. Depending on how many and where changes to the codebase are required, the plugin API could be expanded to facilitate your changes.

Now that being said, implementing it directly in the codebase at first is not a bad idea. It is not time lost, as this will serve as a POC and help us understand the reach of the needed changes - and thus the potential changes to the plugin API.

rigelk avatar Sep 27 '19 21:09 rigelk

Where is it uploaded exactly? A third party (a pinning service?)?

To add a bit more detail to what @LoveIsGrief mentioned: when a file is added to IPFS it is given a UnixFS structure, cryptographically hashed, registered as a content identifier (CID) in an IPLD, broken into sets of Merkle-DAG blocks and then distributed peer-to-peer using a DHT.

* resilience can be achieved in IMHO simpler ways. Right now what makes a video's origin instance disappearance a problem is that the BitTorrent tracker is the instance. Sharing the video over DHT alongside would already solve the problem, without changing our infrastructure.

From what I understood, unlike the DHTs utilized with torrents, the IPFS network doesn't focus on seeding each file individually but rather on distributing the individual file-blocks themselves among many peers. In that case, a typical IPFS node doesn't really "seed" individual files; it only temporarily caches blocks of various files, and only stores complete sets of blocks for specific files when those are explicitly pinned. Otherwise, the cache gets garbage-collected and erased after a specific time period that's configured on each individual node.

So unlike a PeerTube instance that goes down with a specific torrent which happened to be seeded from one specific location (the instance itself), once the same file gets cached throughout enough IPFS nodes, it has a sort of grace period for being retrieved. During that period it has the chance of being saved/pinned on another IPFS node or imported into another hypothetical PeerTube instance that supports retrieval from IPFS.

The IPFS node in this regard also doesn't have to have anything to do with PeerTube instances to begin with, as it simply provides the files as long as there's someone requesting them.

TL;DR - Torrent DHTs only replace a tracker, and in that regard only point to files that may or may not be seeded anymore. IPFS provides a P2P CDN of sorts that can cache those files independently of their initial seeding PeerTube instance, and thus preserves them over a pool that operates independently and is not restricted to any particular instance and/or file.

* users can already host the data, with a webtorrent-compatible torrent client.

In practice, users who just consume content tend not to do so, unlike IPFS nodes, which do so upon request from the get-go. E.g., while the initial "seed" has to come from an IPFS node running alongside a PeerTube instance, once the file has propagated through multiple requests, it can be retrieved from other non-associated nodes within a given time-span.

That being said, yes, to ensure the file doesn't disappear from the network there has to be an IPFS node somewhere that pins some of the content from the aforementioned PeerTube instance, which is a concept similar to a basic seedbox. However, considering the pre-existing network of peers that can automatically participate, this gives those files better chances in terms of availability.

* less bandwidth consumption is already achieved with WebTorrent: people watching share their bandwidth (and not just those with an extension)

With IPFS there are also dedicated nodes that run on Independent servers/VPS's who distribute the content, in addition to users who just seed some of it for the duration of its run and then move along.

So essentially if a video can be streamed from each public IPFS gateway within a pre-defined list, all the PeerTube instance has to do to offload traffic is just pick and point to one of those IPFS "edge" gateways and serve the page to the user as usual.

NightA avatar Sep 28 '19 14:09 NightA

This phase has the potential of requiring double the disk space since the files will be stored on disk normally, then uploaded to IPFS and pinned. In order to prevent that, ipfs-fuse could be used.

You can just use the --nocopy option on add, and --raw-leaves if you want to store it only in IPFS and get the same hash as with --nocopy.

ipfs add --nocopy file_name
ipfs add --nocopy -r directory_name

Additionally, if you use --chunker=rabin, different files will share the same parts.

ipfs add --nocopy --chunker=rabin file_name
ipfs add --nocopy -r --chunker=rabin directory_name

A local webseed (http://127.0.0.1:8080/ipfs/{cid(hash) of file or directory}) can be added to the torrent file. Torrent clients will use it if it is available.

WebTorrent can fetch public gateways and replace ":hash" with the {cid(hash) of file or directory} from the local webseed link, and use those gateways as alternative webseeds.

You can add a comment to the torrent file with the options that were used when the file was added to IPFS. A user can then re-add it to IPFS with those options and get the same root hash.

ivan386 avatar Sep 29 '19 10:09 ivan386

Any updates on this issue? IPFS is very important because it's an efficient tool against censorship.

ghost avatar Dec 18 '19 14:12 ghost

I don't think IPFS support can be added to PeerTube that easily. My suggestion would be to create a different client intended to use an IPFS-based backend. Adding IPFS to PeerTube as it is would introduce several compatibility issues and increase code complexity. Also, maybe a PoC of IPFS video streaming with benchmarking should be done first?

sundowndev avatar Feb 04 '20 13:02 sundowndev

@sundowndev Simple player: https://github.com/ivan386/ipfs-online-player Example of how it works: https://gateway.ipfs.io/ipfs/QmX8DUfyL7sVSa61Hwvx4qiTHPcpWrNxvb3XSm5iiqH8Ph/#/ipfs/QmdtE78NHJGByBpoPREMQA142oj9hFPmQRxMniDsbdhw5d

ivan386 avatar Feb 04 '20 15:02 ivan386

Can someone explain to me why it makes sense to store videos on IPFS beyond being able to upload/publish them by IPFS address? You're backing a distributed video storage system with another distributed storage system. IPFS works great with small files that people are likely to rehost within some other system, but in PeerTube it just duplicates the necessity of clients sharing video over WebTorrent to other peers. It could help simplify the process of sharing videos with other instances, but WebTorrent already does that. There's some sense in using a Sia host as a backend for storing data within an instance, as an alternative to just the filesystem or S3, but there isn't much advantage (at least not enough to justify the increased complexity) for something as involved as IPFS.

delbonis avatar Jun 25 '20 04:06 delbonis

Right now, all public IPFS gateways can be automatic webseeds for WebTorrent. Public gateways would be a cache proxy between IPFS peers and WebTorrent peers.

IPFS is better than WebTorrent here: you can rename a file, change its container, or cut part of a video without re-encoding, and IPFS will still share the unchanged chunks with the original file.

Example: I changed the container from MP4 to MKV.

ffmpeg -i video.mp4 -c copy video.mkv

And added both files to IPFS with the rabin chunker

ipfs add --nocopy --chunker=rabin-1024-2048-4096 video.mp4
ipfs add --nocopy --chunker=rabin-1024-2048-4096 video.mkv

They share 95% of the same blocks, and will each be a 95% source for the other. Each file is 1.3 GB.

ivan386 avatar Jun 25 '20 12:06 ivan386

@delbonis you can't seed WebTorrent without a web browser, and it's impossible to do so with PeerTube, i.e. you can't take a list of PeerTube URLs, download the videos and tell your PC to seed them for other PeerTube users.

Recently, after years of work, libtorrent added WebTorrent support, so maybe we will see torrent clients supporting it too. Then PeerTube would need some kind of "seed this video from your torrent client of choice" button.

alxlg avatar Jun 25 '20 12:06 alxlg

There's some sense to using a Sia host as a backend for storing data within an instance, as an alternative to just the filesystem or S3, but there isn't much advantage (at least not enough to justify the increased complexity) for something as involed as IPFS.

In that perspective, Tardigrade, a decentralized cloud storage platform, could also be a good option to serve as an S3-like storage bucket. Not the same as IPFS though, but still interesting, maybe to implement via a plugin

I'm just mentioning it so that you know this decentralized cloud platform exists and might be interesting 🙂

thomas-kuntz avatar Jul 05 '20 16:07 thomas-kuntz

Hi, you can upload from browser directly with js-ipfs, we have a PoC in ipfs-upload

Regards :up:

manalejandro avatar Aug 03 '20 19:08 manalejandro

I've been experimenting with this, here's a player that kinda works on PeerTube's existing HLS files after importing them into ipfs with --nocopy. This requires a lot of extra CPU to do the hashing on import but it doesn't use much extra disk space.

https://github.com/scanlime/hls-ipfs-player

ghost avatar Sep 04 '20 10:09 ghost

For what it's worth there is a ton of hype around IPFS and a ton of use cases that make no sense, so I should explain what brought me here:

I was interested in having a more unified data model between the "peer" part of peertube and the actual video storage, and for both of those parts to just generally work a lot better.

Drilling down into the technical details, a big decision is how the data gets hashed. In the current system, we use either WebTorrent (hash everything once the content is fully uploaded) or the novage p2p loader, which is very YOLO and trusts everyone. By contrast, IPFS hashes data in blocks that default to 256kB, max 1MB, which among other things lets processing and replication start before a video upload or stream is finished.

I was investigating architectures that let us keep a consistent data model between different parts of the network (clients, small servers, clustered servers, desktop apps, various caches) and IPFS came up. It's interesting because it solves the P2P problems we were solving on the client in broadly similar but specifically more competent ways.

ghost avatar Sep 04 '20 10:09 ghost

I've been experimenting with this, here's a player that kinda works on PeerTube's existing HLS files after importing them into ipfs with --nocopy. This requires a lot of extra CPU to do the hashing on import but it doesn't use much extra disk space.

https://github.com/scanlime/hls-ipfs-player

Hi, I was testing https://github.com/moshisushi/hlsjs-ipfs-loader and made a hybrid version that does P2P in the client too. It could be interesting to share across two P2P platforms, one in IPFS and another in the client, but this doesn't work yet; I need some pending IPFS implementations in the JS client version. Best regards :ok_hand: hlsjs-ipns-p2p-loader.js.txt

manalejandro avatar Sep 04 '20 11:09 manalejandro

I'm not sure what you mean by "two p2p platforms" here? If you mean retaining compatibility with webtorrent or with the novage loader, that may be doable but I'm not sure the existing systems are working well enough that it's useful to retain compatibility with them.

I'm using that same moshisushi/hlsjs-ipfs-loader project but I forked it to fix some bugs, that's probably what you're running into.

scanlime avatar Sep 05 '20 01:09 scanlime

Another note on storage overheads: I have about 4.5 TB loaded in so far, and the additional disk usage is only about 8 GB. But the memory usage has been consistently fairly high, about 8 GB on this system. That's mostly for the DHT, I suspect.

One idea that's been kicking around in my head is to build a media server (possibly based on this, https://gitlab.com/valeth/javelin) that can take live streams in/out as well as handling long term storage. One way to do the storage may be to have a very lightweight built-in IPFS server which would manage the hash database and communicating content with other servers as well as with clients that have native ipfs support. Clients would still get web video from the media server side of that daemon over whatever formats they need.

I'm still interested in finding a storage strategy for this which can unify the two formats we currently use (original webtorrent and HLS fmp4) as well as providing a path to introduce future formats without compounding the storage overhead further. The DAG data structure of IPFS may make it possible to do some kinds of container conversions in a way that references chunks of the original data rather than copying it? That's something I still need to experiment with.

scanlime avatar Sep 05 '20 01:09 scanlime

And added both files to IPFS with chunker rabin

ipfs add --nocopy --chunker=rabin-1024-2048-4096 video.mp4
ipfs add --nocopy --chunker=rabin-1024-2048-4096 video.mkv

And they have 95% same blocks. And will be 95% source for each other. Each file size is 1.3 GB.

That sounds very impressive! Have you calculated the overall space savings though? That's such a small block size that I'd expect the parent nodes that point to all the chunks to themselves require a huge amount of space. Is it enough to make up for avoiding duplicating the video content?

I'd expect that, best case, rabin identifies all the best block boundaries and separates the audio/video streams from the container perfectly at each packet; but then you have to store, as metadata, a reference tree with several hashes per packet.

I'd like to explore this still maybe, but I'm also interested in a way to teach some kind of ipfs-compatible-ish server, possibly either the real go-ipfs, or a light ipfs implementation that's part of a media server, to do remuxing or real-time transcoding to recreate blocks that don't exist. I'm wondering if this could work internally with a mechanism similar to how --nocopy is implemented, where the database stores pointers to filesystem data. In this case the database would store enough information to quickly remux or transcode a section of the input file. This wouldn't prevent having to do the transcode or remux once at first though, in order to determine what the hashes are. It would just be a way to garbage-collect resolutions that aren't used much and then regenerate them quickly as they're needed again.

scanlime avatar Sep 05 '20 04:09 scanlime

FYI I've started a whole new project backed by IPFS but with different design and goals than PeerTube. Also it targets not only video but also audio content. It's written in Go and depend on FFmpeg. Feel free to give feedback about the design! (although this is still a WIP for now)

sundowndev avatar Sep 05 '20 10:09 sundowndev

I'm not sure what you mean by "two p2p platforms" here? If you mean retaining compatibility with webtorrent or with the novage loader, that may be doable but I'm not sure the existing systems are working well enough that it's useful to retain compatibility with them.

I'm using that same moshisushi/hlsjs-ipfs-loader project but I forked it to fix some bugs, that's probably what you're running into.

Because you have two P2P networks: one behind IPFS, which works internally, and another at the client level using P2P software to share the content. The plugin I uploaded could connect the two worlds: you watch a video over IPFS while sharing the content. The advantage of IPFS vs. WebTorrent is that torrents must be closed, while IPFS can stay open to share live content, for example. Regards

manalejandro avatar Sep 05 '20 10:09 manalejandro

And they have 95% same blocks. And will be 95% source for each other. Each file size is 1.3 GB.

That sounds very impressive! Have you calculated the overall space savings though? That's such a small block size that I'd expect the parent nodes that point to all the chunks to themselves require a huge amount of space. Is it enough to make up for avoiding duplicating the video content?

@scanlime

My mistake. The file sizes are 1.6 GB with 76% shared blocks; 1.8% is metadata.

ipfs add --nocopy --chunker=rabin-1024-2048-4096 video.mp4

Result:

Blocks size:     1 655 040 126 bytes -
File size:       1 624 999 455 bytes =
Metadata size:      30 040 671 bytes (1.8%)
ipfs add --nocopy --chunker=rabin-1024-2048-4096 video.mkv

Result:

Blocks size:     1 654 562 731 bytes -
File size:       1 624 528 837 bytes =
Metadata size:      30 033 894 bytes (1.8%)

I use this script (js-ipfs-same) to compare blocks tree.

Same data: 1 246 038 243 bytes (76%)

But for converting the container from MP4 to HLS, the rabin chunker will not work.
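For reference, the percentages above can be recomputed from the raw byte counts with a quick one-liner (the numbers are the ones reported by the `ipfs add` runs and the js-ipfs-same comparison):

```shell
# Recompute the overhead figures from the measured byte counts above.
awk 'BEGIN {
  file = 1624999455   # file size of video.mp4
  meta = 30040671     # metadata (non-leaf block) bytes
  same = 1246038243   # bytes shared between the MP4 and MKV block trees
  printf "metadata overhead: %.1f%%\n", meta / file * 100
  printf "shared data:       %.1f%%\n", same / file * 100
}'
```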

ivan386 avatar Sep 05 '20 12:09 ivan386

FYI I've started a whole new project backed by IPFS, but with a different design and goals than PeerTube. It also targets audio content, not only video. It's written in Go and depends on FFmpeg. Feel free to give feedback about the design! (although this is still a WIP for now)

That's interesting, seems like a niche that many folks would appreciate being filled! One question though, you are striving for privacy, but IPFS is really antithetical to that goal by default as it's constantly broadcasting information about what content your node has locally. This seems fine for a public site like most PeerTube instances are, but it might be less appropriate for your use case here.

scanlime avatar Sep 05 '20 20:09 scanlime

I'm not sure what you mean by "two p2p platforms" here? If you mean retaining compatibility with webtorrent or with the novage loader, that may be doable but I'm not sure the existing systems are working well enough that it's useful to retain compatibility with them. I'm using that same moshisushi/hlsjs-ipfs-loader project but I forked it to fix some bugs, that's probably what you're running into.

Because you have two P2P networks: one behind IPFS, which works internally, and another at the client level using P2P software to share the content. The plugin I have uploaded could connect the two worlds: you are watching a video over IPFS and sharing the content at the same time. The advantage of IPFS vs WebTorrent is that torrents must be closed, while IPFS can stay open to share live content, for example. Regards

Ok. I think I get what you're saying. To me the interesting part of using IPFS for this is that you technically would not have two different networks, you would have one network (with different tuning and capabilities for sure) but with the same data model on each side.

I think there's a lot of architectural fuzziness to get lost in here but I find it helpful to think concretely about how the data is being identified and authenticated. With IPFS there's a consistent way to identify and check data across the network. With PeerTube as it stands now, we use bittorrent hashes in part of the system and no hashing at all elsewhere.

Also keep in mind that browsers have a very limited ability to help share video data to others. It might be useful for networks where many clients are sharing data with each other locally through a limited internet connection. But for most PeerTube-like setups I'd expect that the benefit of having better p2p file sharing is in making it easier for loose federations of server operators to mirror each other's data in a way that increases reliability rather than decreasing it :)

scanlime avatar Sep 05 '20 20:09 scanlime

In the current system, we use either WebTorrent (hash everything once the content is fully uploaded) or the novage p2p loader, which is very YOLO and trusts everyone.

My mistake, I did not notice that we actually have another hashing system that's used by the HLS P2P loader. That one uses SHA256 hashes computed on entire segments. So, not really compatible with either other system directly, but it wouldn't be a big deal to retain compatibility with the system.
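To make that concrete: the HLS loader's scheme is just a flat SHA256 over each whole segment, which any peer-supplied copy can be checked against. A minimal sketch, with a placeholder file standing in for a real `.ts` segment:

```shell
# Hash a segment the way the HLS P2P loader does: one SHA256 over the
# entire segment file. A client fetching this segment from an untrusted
# peer recomputes the hash and compares it to the trusted value.
printf 'hello\n' > segment.ts            # placeholder segment data
sha256sum segment.ts | cut -d ' ' -f 1   # digest to verify against
```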

My main issue with the novage p2p loader is that I haven't come across very many situations where it's actually useful. It might help with individual extremely popular videos, but does anyone have those right now? Does it serve blocks fast enough for clients to see any benefit?

PeerTube's redundancy system is almost great... it only works with trusted peers right now though. A single slow or malicious redundancy peer will ruin the client's experience. That's the problem I'm interested in potentially solving with IPFS: having a good way to open the floodgates to crowdsourced data redundancy in a way that actually works. Browsers are really bad at this, they might be able to help a bit but the core of the network really needs to be servers that link together if we will have p2p that fits users expectations for loading video.

scanlime avatar Sep 06 '20 06:09 scanlime

@scanlime

But for most PeerTube-like setups I'd expect that the benefit of having better p2p file sharing is in making it easier for loose federations of server operators to mirror each other's data in a way that increases reliability rather than decreasing it :)

This is exactly what I mean when opening the issue, thank you and all the others for the very interesting replies so far!

alxlg avatar Sep 06 '20 08:09 alxlg

FYI I've started a whole new project backed by IPFS, but with a different design and goals than PeerTube. It also targets audio content, not only video. It's written in Go and depends on FFmpeg. Feel free to give feedback about the design! (although this is still a WIP for now)

That's interesting, seems like a niche that many folks would appreciate being filled! One question though, you are striving for privacy, but IPFS is really antithetical to that goal by default as it's constantly broadcasting information about what content your node has locally. This seems fine for a public site like most PeerTube instances are, but it might be less appropriate for your use case here.

@scanlime What I mean by privacy is analytics and user tracking. IPFS is just storage, so if you want to keep your storage private, you can create your own IPFS swarm with a set of peers. From what I've understood, you don't even have to isolate the pinset behind a private network; you just have to configure a trusted peers list. You can then create a gateway to control access to the cluster content. Does that answer your question? I'm new to IPFS, by the way, but from what I've read in the docs, IPFS works fine for both cases.

sundowndev avatar Sep 06 '20 15:09 sundowndev

@scanlime What I mean by privacy is analytics and user tracking. IPFS is just storage, so if you want to keep your storage private, you can create your own IPFS swarm with a set of peers. From what I've understood, you don't even have to isolate the pinset behind a private network; you just have to configure a trusted peers list. You can then create a gateway to control access to the cluster content. Does that answer your question? I'm new to IPFS, by the way, but from what I've read in the docs, IPFS works fine for both cases.

I think you're misunderstanding what ipfs-cluster is for.. it doesn't change who the individual IPFS nodes contact, it's an additional layer on top of ipfs so that groups of servers can agree to reliably serve a set of content together. That's all about reliability, not about privacy.

You have to understand that IPFS (and all p2p protocols I know about) are quite promiscuous, they will just ask anyone they're connected to for the data they need, or broadcast the data they have to random people. To retain privacy, you would really need to have a completely closed network, in which case there isn't much point to the protocol.

This seems fine for PeerTube and is roughly equivalent to our existng level of privacy here. But I wouldn't use IPFS as a basis for creating a project that has privacy as its focus.

scanlime avatar Sep 06 '20 18:09 scanlime

@scanlime Alright, thanks for taking the time to explain that to me. I'll look closer at the design of IPFS and its use cases. On a side note, what I can do is simply offer IPFS as a storage option in my project. You could then choose between local storage, IPFS, or whatever else (S3, SFTP...).

I should stop discussing this here since it's off-topic, but feel free to create an issue in dreamvo/gilfoyle to discuss it further :)

sundowndev avatar Sep 07 '20 08:09 sundowndev

I'm still experimenting with IPFS. There's a lot to like, but the RAM usage is a real drag, and might be the main reason not to use it in most PeerTube-style video sites. Each IPFS node needs enough memory to quickly map hashes to disk blocks, and on my ~9TB dataset the ipfs daemon wants to use around 11GB of RAM even with few peers. The hashes themselves are large, but I'm not sure how to explain the bulk of that usage: a 32-byte hash for each 256kB block is only 128MB per terabyte. (Edited: math was off by 1024x, oops)
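Spelling out that back-of-envelope estimate (raw hash storage only; the DHT, provider records, and go-ipfs's own data structures are presumably where the rest of the 11GB goes):

```shell
# One 32-byte hash per 256 KiB block, over 1 TiB of content.
blocks_per_tib=$(( (1 << 40) / (256 * 1024) ))   # 4,194,304 blocks
index_bytes=$(( blocks_per_tib * 32 ))           # raw hash bytes
echo "$(( index_bytes / 1024 / 1024 )) MiB of raw hashes per TiB"
```

which matches the 128MB-per-terabyte figure.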

scanlime avatar Sep 07 '20 14:09 scanlime

FYI I've started a whole new project backed by IPFS, but with a different design and goals than PeerTube. It also targets audio content, not only video. It's written in Go and depends on FFmpeg. Feel free to give feedback about the design! (although this is still a WIP for now)

Will it somehow relate to peertube? Maybe federate with peertube?

ghost avatar Sep 17 '20 06:09 ghost

FYI I've started a whole new project backed by IPFS, but with a different design and goals than PeerTube. It also targets audio content, not only video. It's written in Go and depends on FFmpeg. Feel free to give feedback about the design! (although this is still a WIP for now)

Will it somehow relate to peertube? Maybe federate with peertube?

I don't think so, as it's not designed to work with PeerTube or other P2P platforms. But feel free to discuss the design in the repository.

sundowndev avatar Sep 17 '20 09:09 sundowndev

It's not a real project yet but I've still been experimenting with an IPFS-backed video server that would be compatible with PeerTube federation. There are a lot of performance hurdles to get over but the large amount of existing tooling for sharing hash-addressed blocks still feels compelling to me.

scanlime avatar Sep 17 '20 18:09 scanlime

I think you're misunderstanding what ipfs-cluster is for.. it doesn't change who the individual IPFS nodes contact, it's an additional layer on top of ipfs so that groups of servers can agree to reliably serve a set of content together. That's all about reliability, not about privacy.

@scanlime I've seen this repository showing how to set up a private IPFS swarm to host private files. Apparently this is possible by removing the default bootstrap nodes and isolating your peers inside a private network. See also https://github.com/ipfs-inactive/faq/issues/4 where the creator of IPFS himself says that this is part of their goals. So I guess choosing IPFS for privacy use is possible and not an anti-feature.

sundowndev avatar Sep 24 '20 14:09 sundowndev

See also ipfs-inactive/faq#4 where the creator of IPFS himself says that this is part of their goals. So I guess choosing IPFS for privacy usage is possible and not an anti-feature.

That is 5 years old; there is a much better way now: you can create private IPFS networks by restricting P2P connections with a pre-shared key. This is only really practical for situations where you are using IPFS in a data center or a corporate environment, and I don't see how that would apply to something like PeerTube, which is intended to be an open network.
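For the curious, setting up such a private network amounts to dropping an identical pre-shared key file into every node's repo before starting the daemons. A minimal sketch, assuming go-ipfs's `swarm.key` convention and the default repo location:

```shell
# Generate a swarm.key for a private IPFS network. Every node that should
# be part of the private swarm needs an identical copy of this file in
# its repo (by default ~/.ipfs/swarm.key); nodes without it are rejected.
{
  echo "/key/swarm/psk/1.0.0/"
  echo "/base16/"
  openssl rand -hex 32      # 256-bit pre-shared key
} > swarm.key
cat swarm.key
```

Each node should also have the default bootstrap list removed (`ipfs bootstrap rm --all`) so it only ever dials the private peers.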

If you are using ipfs and you care about privacy you have to understand it at least. Everyone your node connects with can easily find out what content you have and what content you're looking for. Like most P2P systems.

scanlime avatar Sep 24 '20 16:09 scanlime

Everyone your node connects with can easily find out what content you have and what content you're looking for. Like most P2P systems.

Yes, that's the point of a swarm; this shouldn't be a problem if you control each node. But from what I see, using IPFS with a private network is an experimental feature that's not officially supported yet, although it is not against IPFS's design and goals. But you're right, this is not a feature that PeerTube would need.

sundowndev avatar Sep 24 '20 17:09 sundowndev

This is a bit off topic, but let me say that the Internet is not anonymous at a lower level of the stack, and there is in general no way to share data publicly on a P2P network without making other nodes aware of what data you are sharing. When we say PeerTube is privacy-friendly we mean privacy, not anonymity. This means you are not tracked across the Web and nobody is using your browsing history on PeerTube to profile you. This is about the relation between the user and the tech giants providing Web services.

The problem of being anonymous is entirely about how the Internet infrastructure is managed in the nation you live in. If you don't trust private ISPs, you should politically demand a public Internet service. If you don't trust your government, you need to change it. No technology is going to fix that.

alxlg avatar Sep 24 '20 17:09 alxlg

If you don't trust your governament, you need to change it.

This is true, but I'll also add that this is the whole point of technical privacy systems like Tor.

scanlime avatar Sep 25 '20 19:09 scanlime

This whole thread seems to be missing a demo, so here is one. It is complete with P2P, loads extremely quickly, and automatically switches between 5 different bit rates:

https://bafybeiazt45dboknwnwwrekot7eenfr62sr6vmxhrwobr4p3cymfmorx5y.ipfs.dweb.link/

You can get the peers by running this in the console:

for await (const peer of await node.swarm.peers()) { console.log(peer.addr.toString()) }

georgyo avatar Oct 30 '20 12:10 georgyo

Actually there's another distributed solution: Sia Skynet. Look at this example: https://siasky.net/EAC6AsZovYp4aIN-FLj1mFEi43WSrGtF7IBZU1T8BzCGfg (it loads faster than 90% of PeerTube instances). The video in the link is the documentary "The Internet's Own Boy: The Story of Aaron Swartz", licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International ( http://creativecommons.org/licenses/by-nc-sa/4.0/ ).

nukelr avatar Nov 12 '20 16:11 nukelr

There are a lot of distributed solutions popping up right now, but only IPFS is not tied to a blockchain. You could argue Filecoin, but IPFS does not depend on Filecoin in any way. This gives IPFS value for all sorts of applications.

Sia requires the use of a token to download or upload files. Sia's splash page hides this, and the link you sent goes through their public gateway, which they are paying for: 100% of the traffic goes through their web servers. These web portals will need to make money somehow eventually, as bandwidth isn't free. Similarly to Sia, the people who own the BitTorrent trademark released BTFS; however, it is also just a mechanism to give value to some coin. It is impossible to use these networks directly without jumping through many hoops.

IPFS, like BitTorrent (the protocol), is much more truly P2P. PeerTube instances could run IPFS nodes without having to figure out how to get coins in and out of them.

Tying PeerTube to a storage token, or any specific coin for that matter, would be a bad move IMHO.

georgyo avatar Nov 12 '20 17:11 georgyo

It's not about tying PeerTube to anything; I think that having more options to choose between is always a good idea. You want to use IPFS? Amazon AWS? Sia? It should be your choice, because in the end someone has to pay for bandwidth and/or storage, so I don't see the point.

nukelr avatar Nov 13 '20 09:11 nukelr

Perhaps I'm missing something obvious, but if the uploader goes offline for some reason, wouldn't the video be inaccessible (sorry if this was already answered)?

alex9099 avatar Dec 04 '20 17:12 alex9099

should be your choice, because in the end someone has to pay for bandwidth and/or storage, so I don't see the point.

@nukelr Yeah the instance operator should be able to configure storage backends however they want.

Perhaps I'm missing something obvious, but if the uploader goes offline for some reason, wouldn't the video be inaccessible (sorry if this was already answered)?

@alex9099 If PeerTube blindly ripped out WebTorrent and replaced it with just the in-browser js-ipfs distribution, and then didn't host the video locally, maybe. But that's a ridiculous idea.

delbonis avatar Dec 04 '20 17:12 delbonis

Perhaps I'm missing something obvious, but if the uploader goes offline for some reason, wouldn't the video be inaccessible (sorry if this was already answered)?

That is relative. Anyway, if you remove WebTorrent the video only becomes unavailable from those peers; if the IPFS content is pinned it will federate between other instances, so it will always be available somewhere on the P2P network.

manalejandro avatar Dec 06 '20 13:12 manalejandro

It would be great for a PeerTube server to be able to talk with IPFS and allow anyone with a PeerTube account on the clear web to watch and connect to a video from IPFS or an unstoppable domain site.

trymeouteh avatar Jan 11 '21 23:01 trymeouteh

On the PeerTube server side, each instance would run its own IPFS repository where the pinned videos would be stored. The storage is local, so the videos would have to be read from the IPFS repository; once the hashes of the video blocks are shared between instances, the content is replicated. I have looked at how PeerTube's current storage is programmed; it is not modular, so it would have to be adapted if a plugin were used. There's also the option of using IPNS as DNS resolution for the instances. You would have two P2P networks: one running on the clients (WebTorrent) and another between instances (IPFS). Yes, it would be unstoppable.

manalejandro avatar Jan 13 '21 18:01 manalejandro

It would be good to get an update on mitigating the storage issue with a P2P solution on PeerTube. Are there any paths to this? Is it on the agenda?

Openmedianetwork avatar Feb 02 '21 11:02 Openmedianetwork

The fact is that we use WebTorrent (BitTorrent/WebRTC) on the client side to watch videos. It provides a handy pool of content seeders and direct browser connection. Watching a video via IPFS would mean to replace entirely that component with an IPFS client in the browser. So it's not just thinking of a different storage/upload mechanism.

I think they can both exist in parallel.

kotovalexarian avatar Mar 12 '21 11:03 kotovalexarian

I got it: IPNS in the player :thinking:

https://manalejandro.com/~/Art%C3%ADculos/reproductor-p2p-con-enlace-ipns

We could choose between the IPFS and P2P player loaders :wink: unstoppable

manalejandro avatar Jun 15 '21 18:06 manalejandro

It's hardly "unstoppable" if it's strictly less reliable than a single server and a single DNS entry. If you really want to avoid "censorship" (whatever that means to you) you actually need multiple reliable paths to serve the content. My experience with ipfs is that it is strictly less reliable than the network you're overlaying it on top of.

For what it's worth I'm not too optimistic about IPFS even after all the time I spent experimenting with video distribution. Ultimately, IPFS doesn't have a great content routing mechanism and it's extremely memory intensive, because it's not magic and all content-addressed storage is extremely memory intensive. Folks get optimistic about IPFS when it works well in small-scale tests, but that's mostly because in small tests you'll be able to keep enough open connections to just gossip your entire working set between hosts that already know about the data you are requesting.

None of it is private or efficient or hard to stop. I'd love to find a more efficient and reliable way to cluster storage together from multiple non-trusted hosts, but I've been thinking there does need to be at least a minimal trust relationship built around quality of service. I like the idea of peertube servers getting a "buddy" or two, with each server in each group capable of providing failover for the others.

scanlime avatar Jun 15 '21 23:06 scanlime

It's hardly "unstoppable" if it's strictly less reliable than a single server and a single DNS entry. If you really want to avoid "censorship" (whatever that means to you) you actually need multiple reliable paths to serve the content. My experience with ipfs is that it is strictly less reliable than the network you're overlaying it on top of.

For what it's worth I'm not too optimistic about IPFS even after all the time I spent experimenting with video distribution. Ultimately, IPFS doesn't have a great content routing mechanism and it's extremely memory intensive, because it's not magic and all content-addressed storage is extremely memory intensive. Folks get optimistic about IPFS when it works well in small-scale tests, but that's mostly because in small tests you'll be able to keep enough open connections to just gossip your entire working set between hosts that already know about the data you are requesting.

None of it is private or efficient or hard to stop. I'd love to find a more efficient and reliable way to cluster storage together from multiple non-trusted hosts, but I've been thinking there does need to be at least a minimal trust relationship built around quality of service. I like the idea of peertube servers getting a "buddy" or two, with each server in each group capable of providing failover for the others.

Only if you use DNS; with IPNS it is not necessary :smiley: I mean, we have a player that works at half power and could offer a lot more to choose from :play_or_pause_button: IPFS has more advantages than disadvantages.

manalejandro avatar Jun 16 '21 13:06 manalejandro

I think IPFS is really cool, but I am not so sure it is ready for this use case, at least with native js-ipfs. It is simply not fast or efficient enough.

Consider your demo (or my demo at https://bbb.fu.io): it takes multiple seconds before the video even starts playing. My demo has the video seeded on a dozen IPFS nodes, and the JavaScript is bootstrapped with an IPFS node that has the video on it; it still takes multiple seconds to load, and significantly longer on lower-power devices such as mobile.

My demo has multiple bit rates, and the JavaScript cannot load data fast enough to play 1080p without pausing to buffer every few seconds. The loader is smart enough to choose a lower bitrate, so a user might not notice. But if the user chooses a higher bitrate, they will not have a good time.

IPFS is getting there, but it is not ready for browser video playback just yet.

georgyo avatar Jun 16 '21 14:06 georgyo

I think IPFS is really cool, but I am not so sure it is ready for this use case, at least with native js-ipfs. It is simply not fast or efficient enough.

Consider your demo (or my demo at https://bbb.fu.io): it takes multiple seconds before the video even starts playing. My demo has the video seeded on a dozen IPFS nodes, and the JavaScript is bootstrapped with an IPFS node that has the video on it; it still takes multiple seconds to load, and significantly longer on lower-power devices such as mobile.

My demo has multiple bit rates, and the JavaScript cannot load data fast enough to play 1080p without pausing to buffer every few seconds. The loader is smart enough to choose a lower bitrate, so a user might not notice. But if the user chooses a higher bitrate, they will not have a good time.

IPFS is getting there, but it is not ready for browser video playback just yet.

It takes a while to connect and download the data, but it works fine on my device :+1:

screenshot-android

manalejandro avatar Jun 16 '21 17:06 manalejandro

A use case I am thinking of is for users to be able to import videos from the IPFS network. I don't think IPFS should be used for sending video to the client, but it might work well for server-side storage, especially if the PeerTube server caches the video.

minecraftchest1 avatar Oct 19 '21 16:10 minecraftchest1

A use case I am thinking of is for users to be able to import videos from the IPFS network. I don't think IPFS should be used for sending video to the client, but it might work well for server-side storage, especially if the PeerTube server caches the video.

https://ffmpeg.org/ffmpeg-protocols.html#ipfs
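Since ffmpeg 5.1 there is a native `ipfs://` input protocol (resolved through an IPFS gateway), so a server-side import could be as simple as pointing ffmpeg at a CID. A dry-run sketch, assuming a local gateway; the CID is a placeholder and the command is printed rather than executed:

```shell
# Build the ffmpeg command that would remux a video out of IPFS.
# Requires ffmpeg >= 5.1; the gateway is taken from $IPFS_GATEWAY.
CID="bafybeiexamplecidplaceholder"             # hypothetical CID
export IPFS_GATEWAY="http://127.0.0.1:8080"    # local gateway root
echo ffmpeg -i "ipfs://$CID" -c copy imported.mp4   # dry run: print only
```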

manalejandro avatar Jul 25 '22 22:07 manalejandro

Another idea would be to use IPLD through IPNS; this way your IPFS settings would come from your DNS server, so you wouldn't have to code anything, since PT wouldn't see any URL changes. I'm talking about the case where you have your own IPFS cluster, of course.

ROBERT-MCDOWELL avatar Jul 31 '22 11:07 ROBERT-MCDOWELL