
Identifying Dead Links And Replacing Them

Open bonedaddy opened this issue 4 years ago • 9 comments

For now most links are active and viewable; however, we will inevitably get dead links, such as those reported in https://github.com/2020PB/police-brutality/pull/392

Although these links are "dead", the data isn't lost, since it will be captured by my archiver tool. We need a method for:

  1. Identifying dead links
  2. Replacing/supplementing dead links with the backups on IPFS

I'm not sure what the best method is. I suppose I could keep a central listing where I periodically post the new backup links?
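One way the first step (identifying dead links) could be sketched is a small script that requests each URL and reports the ones that fail. This is only an illustration, not anything the project ships: the names `fetch_status` and `find_dead_links` are hypothetical, and treating any status of 400 or above (or no response at all) as "dead" is an assumption. Some sites reject `HEAD` requests or block automated clients, so flagged links would still need a manual look.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response was received."""
    request = urllib.request.Request(
        url,
        method="HEAD",  # assumption: HEAD is enough; some servers only answer GET
        headers={"User-Agent": "link-checker-sketch"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 410.
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def find_dead_links(urls, fetch=fetch_status):
    """Return (url, status) pairs for links that look dead.

    A link is treated as dead when there is no response or the status is >= 400.
    """
    dead = []
    for url in urls:
        status = fetch(url)
        if status is None or status >= 400:
            dead.append((url, status))
    return dead
```

The `fetch` parameter exists so the check can be exercised without network access, for example by passing a dictionary's `get` method that maps URLs to canned status codes.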

bonedaddy avatar Jun 07 '20 20:06 bonedaddy

I think we could use a torrent file so others can grab from your archive and create redundancy. Of course we need an ID system to ensure it's easy to grab from the file. This would also allow maintainers to re-upload.

We may want to use a combo of free services like streamable and image.fri, so most folks can re-establish links, as well as an AWS solution that one of the maintainers could host, as suggested in another thread: a mix of centralized and decentralized.

ghost avatar Jun 08 '20 21:06 ghost

IPFS is somewhat like torrents in the sense that people can "seed" the data. I have a WIP PR, https://github.com/2020PB/police-brutality/pull/286, that contains instructions on how to mirror the archive.

bonedaddy avatar Jun 08 '20 23:06 bonedaddy

@bonedaddy I think we're going to need to have some of these backed-up media links inside the repo directly, especially when links die.

ubershmekel avatar Jun 11 '20 02:06 ubershmekel

@bonedaddy @ubershmekel may I propose a simple square bracket tag for identifying dead links? I just found one:

image

Murkantilism avatar Sep 08 '20 00:09 Murkantilism

I would make the language less morbid, but I agree. Perhaps something like

[original link that is now broken](https://example.com)

ubershmekel avatar Sep 08 '20 23:09 ubershmekel

@ubershmekel ah perhaps I chose a bad example, I meant more like this, with a whitespace separator:

[Dead] [Photojournalist's account](https://twitter.com/bfeinzimer/status/1277014331968782339)

This preserves the original context if someone tries to replace the link. And yeah, I'm fine with different language, something like [Broken] or [404].

Murkantilism avatar Sep 08 '20 23:09 Murkantilism

@Murkantilism I misread your example. At the moment I would prefer to keep the markdown syntax to keep the parser simple and fit the existing data structure at https://raw.githubusercontent.com/2020PB/police-brutality/data_build/all-locations-v2.json

ubershmekel avatar Sep 09 '20 05:09 ubershmekel

@ubershmekel ah good point! Maybe a pipe separator within the link markdown, something like this?

[Broken Link | Photojournalist's account](https://twitter.com/bfeinzimer/status/1277014331968782339)
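To illustrate why the pipe form stays easy to parse, here is a rough sketch that reads and writes it. It is not the project's actual parser: `parse_link`, `mark_broken`, and the regular expression are hypothetical, and the exact prefix text `Broken Link` is assumed from the example above.

```python
import re

# Matches a single markdown link of the form [text](url).
LINK_RE = re.compile(r"\[(?P<text>[^\]]*)\]\((?P<url>[^)\s]+)\)")
BROKEN_PREFIX = "Broken Link"  # assumed marker text, taken from the example


def parse_link(markdown):
    """Split a markdown link into text, url, and a broken flag.

    Returns None when the input is not a single markdown link.
    """
    match = LINK_RE.fullmatch(markdown.strip())
    if match is None:
        return None
    text = match.group("text")
    prefix, separator, rest = text.partition("|")
    broken = bool(separator) and prefix.strip() == BROKEN_PREFIX
    return {
        "text": rest.strip() if broken else text.strip(),
        "url": match.group("url"),
        "broken": broken,
    }


def mark_broken(markdown):
    """Return the link with the broken prefix added, leaving other input unchanged."""
    link = parse_link(markdown)
    if link is None or link["broken"]:
        return markdown
    return f"[{BROKEN_PREFIX} | {link['text']}]({link['url']})"
```

Because the marker lives inside the link text, the result is still a normal markdown link, so tooling that only understands `[text](url)` keeps working and the original description and URL are preserved.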

Also, do we care about differentiating why a link is broken? E.g., a deleted Twitter account versus a genuine 404 page.

Murkantilism avatar Sep 09 '20 12:09 Murkantilism

@Murkantilism that looks fine by me.

On differentiating why a link is broken: I'd be fine with either option, though managing a nomenclature for such a system might be a bit much for a small project like ours.

ubershmekel avatar Sep 09 '20 15:09 ubershmekel