
Make a P2P network for SponsorBlock database

Open jcastro opened this issue 3 years ago • 13 comments

I LOVE this service and would like to contribute. I don't know how to develop, but I do have some 24/7 servers that I could use to host an instance of the app. Would that be possible? Or could the DB live somewhere we can all update it, so the service would rarely go offline?

jcastro avatar Nov 02 '22 10:11 jcastro

yes please. being able to host a local copy of the db would be nice too, or at least cache certain channels so it doesn't become useless when the server goes down.

having an icon or something that says when it's down would be nice too, instead of just silently failing.

skeddles avatar Nov 02 '22 13:11 skeddles

https://github.com/mchangrh/sb-mirror

ajayyy avatar Nov 03 '22 00:11 ajayyy

@ajayyy this is awesome! thanks so much. I was wondering if it's possible to add several addresses to the 'SponsorBlock Server Address' config in the extension settings, so we can have several mirrors added there as well?

jcastro avatar Nov 03 '22 02:11 jcastro
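
For illustration, the multi-address idea in the comment above could look roughly like the following. This is a minimal sketch, not the extension's actual code: `fetch_with_fallback`, `http_get`, and the example hostnames are hypothetical names introduced here, and the sketch assumes that trying the configured servers in order is the desired behaviour.

```python
from urllib.request import urlopen


def http_get(url, timeout=5.0):
    """Default fetcher: plain HTTP GET; raises OSError subclasses on failure."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_fallback(servers, path, fetch=http_get):
    """Try each server address in order and return (server, body) from the first that answers.

    `servers` is an ordered list of base URLs (hypothetical setting: main server
    first, mirrors after). `fetch` is injectable so the logic can be tested offline.
    """
    errors = []
    for base in servers:
        url = base.rstrip("/") + path
        try:
            return base, fetch(url)
        except OSError as exc:  # URLError, HTTPError and timeouts are all OSError
            errors.append(f"{base}: {exc}")
    raise RuntimeError("all servers failed: " + "; ".join(errors))
```

Usage would be something like `fetch_with_fallback(["https://main.example", "https://mirror.example"], "/some/path")`. Surfacing the collected errors when every server fails would also address the earlier request not to fail silently.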

I found this issue because I got an error saying that the HTTP server hosting sponsor.ajay.app was temporarily unavailable (screenshot: 2022-11-03_16-19).

One way to mitigate this is to put the DB on IPFS (as immutable snapshots) and set up IPNS and/or DNSLink for publishing updates that point at the latest version.

This not only allows P2P retrieval and makes it easier for people to co-host the DB, but also provides an HTTP CDN for regular browsers, thanks to public gateways.

Example (DNSLink):

  • If the HTTP server hosting https://en.wikipedia-on-ipfs.org is down, one can still load the content P2P via IPFS using the value from the DNSLink TXT record, or use any public gateway as a fallback mirror:
    • https://dweb.link/ipns/en.wikipedia-on-ipfs.org
    • https://cf-ipfs.com/ipns/en.wikipedia-on-ipfs.org (Cloudflare)

Going the IPFS route has the nice property of being backward and forward compatible:

  • Regular browsers still benefit from having multiple mirrors that can be used as a fallback if the original server is down.
  • Browsers with built-in IPFS support (like Brave), or with the IPFS Companion browser extension plus the Desktop app, will be able to do an opportunistic protocol upgrade and load the DB P2P.

I work on IPFS and am happy to answer any questions if any of the above seems useful.

lidel avatar Nov 03 '22 15:11 lidel
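
To make the gateway fallback in the comment above concrete, here is a small sketch that builds the mirror URLs for a DNSLink name. The `/ipns/<name>` URL pattern and the two gateway hosts are taken from the examples in the comment; the function name `gateway_urls` is hypothetical, and whether a given public gateway is still operating is not something this sketch checks.

```python
def gateway_urls(dnslink_name, gateways=("https://dweb.link", "https://cf-ipfs.com")):
    """Build one fallback URL per public gateway for a DNSLink-enabled domain.

    The default gateways are the two named in the comment above; any other
    gateway base URL can be passed in instead.
    """
    return [f"{gw.rstrip('/')}/ipns/{dnslink_name}" for gw in gateways]


for url in gateway_urls("en.wikipedia-on-ipfs.org"):
    print(url)
```

A client could walk this list in order when the original HTTP server is unreachable.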

@jcastro I've just submitted PR #1572 to add this - you can build it from my fork if you don't want to wait until it's merged :)

lewisdoesstuff avatar Nov 03 '22 18:11 lewisdoesstuff

beautiful! thanks so much

jcastro avatar Nov 03 '22 19:11 jcastro

I am interested in making the database accessible over IPFS. This would require a lot less bandwidth for mirrors, as everyone would share the same files, unlike rsync mirrors, which are centralized.

@lidel are you aware of any project that can follow an IPNS link and keep a mirror of it, even as the CID updates?

This would be helpful for mirror projects like https://github.com/TeamPiped/sponsorblock-mirror and would save bandwidth for everyone.

I'm also interested in any sort of real-time streaming of events, which could be used to keep mirrors up to date in real time, without polling for database updates every hour.

FireMasterK avatar Nov 05 '22 02:11 FireMasterK

When looking into ideas for sb-mirror, rsync was chosen as it handled deduplication best. The way the CSVs are modified makes IPFS impractical, as most of each file needs redownloading.

ajayyy avatar Nov 05 '22 02:11 ajayyy

For reference, read https://github.com/ajayyy/SponsorBlockServer/issues/373, which lists the alternatives considered and the reasoning.

ajayyy avatar Nov 05 '22 03:11 ajayyy

IPFS has no version control. None of the chunks were deduplicated when uploading a new version, since its chunking is dumb; the views would return a different hash even for earlier segments.

Storing a single file would take up 120% of the space and take my machine about 10 minutes to chunk, hash and split (R5 3600 + NVMe). Subsequent versions of the files would be uploaded under new addresses in the same folder, but each would use up another 120% of space.

mchangrh avatar Nov 05 '22 19:11 mchangrh
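
The deduplication problem described above can be reproduced with a toy example. This is only a sketch under stated assumptions: it takes "dumb" chunking to mean fixed-size chunks addressed by their hash, uses a made-up CSV and a tiny 1024-byte chunk size, and does not call IPFS at all. It shows that appending a row leaves almost every chunk reusable, while inserting a row near the start shifts every later chunk boundary so that nothing is shared.

```python
import hashlib


def chunk_hashes(data, size):
    """Hash each fixed-size chunk of the data; the final chunk may be shorter."""
    return {hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)}


def reused_fraction(old, new, size=1024):
    """Fraction of the new version's chunks that already exist in the old version."""
    old_hashes = chunk_hashes(old, size)
    new_hashes = chunk_hashes(new, size)
    return len(old_hashes & new_hashes) / len(new_hashes)


# Made-up CSV rows (hypothetical data, not the real SponsorBlock export format).
rows = [f"{i},video{i},{i * 7 % 1000}\n".encode() for i in range(5000)]
old = b"".join(rows)
appended = old + b"5000,video5000,0\n"                          # new row at the end
inserted = rows[0] + b"x,new,1\n" + b"".join(rows[1:])         # new row near the start

print(f"append at end:     {reused_fraction(old, appended):.0%} of chunks reused")
print(f"insert near start: {reused_fraction(old, inserted):.0%} of chunks reused")
```

rsync avoids this by matching blocks at arbitrary byte offsets with a rolling checksum, which fits the deduplication result reported above.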

https://docs.google.com/document/d/1YZwsVustQxC8sXsYqsq_sWZ6xnOx01aftjGclrW9zeI/edit?usp=sharing

quick write-up as to why rsync was chosen and why p2p was quickly disregarded

mchangrh avatar Nov 05 '22 20:11 mchangrh

Some mirrors are read-only, and mirrors are also third party, so if you are not getting segments on new videos, make sure to disable the mirrors. As of the time of writing, the main SB server works fine.

Splarkszter avatar Nov 09 '22 01:11 Splarkszter

I think a good solution would be to have many servers in different locations and have the extension query the nearest API server

ccuser44 avatar Jun 19 '23 07:06 ccuser44
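
One way the "query the nearest server" idea could be sketched is below. It assumes that "nearest" can be approximated by the lowest measured round-trip time, which is an assumption of this sketch rather than something stated in the thread; `measure_rtt` and `pick_nearest` are hypothetical names, and the extension does not necessarily work this way.

```python
import time
from urllib.request import urlopen


def measure_rtt(url, timeout=3.0):
    """Return the seconds taken by one GET request, or None if the server is unreachable."""
    start = time.perf_counter()
    try:
        with urlopen(url, timeout=timeout) as resp:
            resp.read(1)
    except OSError:  # covers URLError, HTTPError and timeouts
        return None
    return time.perf_counter() - start


def pick_nearest(servers, measure=measure_rtt):
    """Return the reachable server with the lowest latency, or None if all are down.

    `measure` is injectable so that the selection logic can be tested without a network.
    """
    best_server, best_time = None, None
    for server in servers:
        elapsed = measure(server)
        if elapsed is not None and (best_time is None or elapsed < best_time):
            best_server, best_time = server, elapsed
    return best_server
```

A single measurement is noisy, so a real client would likely average several probes and re-check periodically.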