
Archive external links when updating entries

Open Nefere256 opened this issue 4 years ago • 3 comments

Split from https://github.com/VocaDB/vocadb/issues/656.

VGMdb recently added a link archiving feature:

Submitted and edited links are automatically queued for archiviation on the Wayback Machine, preventing important data and sources from being lost forever. This also applies to links used in the comment field mentioned above.

New links with archived pages have a Wayback Machine icon that links to an archived version of the website. An example with a TouhouDB link.


The available API only documents ways to look up archived pages. An Internet Archive blog post suggests using the save form on the main page of the project.
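As a rough illustration of the lookup side, here is a minimal Python sketch. It assumes the public Wayback Machine availability endpoint at `https://archive.org/wayback/available` and a JSON response with an `archived_snapshots.closest` object; both should be checked against the current Internet Archive documentation. The helper names (`availability_url`, `closest_snapshot`, `save_url`) are hypothetical, and the `https://web.archive.org/save/` prefix is an assumption about how the save form can be addressed directly.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoints; verify against the Internet Archive documentation.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
SAVE_ENDPOINT = "https://web.archive.org/save/"


def availability_url(link: str) -> str:
    """Build the lookup URL for a single external link."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': link})}"


def closest_snapshot(payload: str) -> Optional[str]:
    """Return the closest archived snapshot URL from a JSON response body,
    or None when the response reports no available snapshot."""
    data = json.loads(payload)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def save_url(link: str) -> str:
    """Build a URL that asks the Wayback Machine to capture a link
    (assumed to behave like the save form on the main page)."""
    return SAVE_ENDPOINT + link
```

An entry update could call `availability_url` for each new link, fetch it with any HTTP client, and only queue a save when `closest_snapshot` returns `None`.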

Nefere256 avatar Dec 04 '20 19:12 Nefere256

mentioned

ycanardeau avatar Nov 15 '21 14:11 ycanardeau

The Internet Archive has an email address for "Save Page Now!", [email protected]. It accepts 300 URLs per message ~~IIRC (someone test this)~~.

It sends a log back, with either a URL to web.archive.org or an error message for each URL. Sometimes it will automatically retry if there are errors and send another log. Sometimes it will ignore your message (maybe a daily limit from one address).


There is also a Google Sheets-based service according to this article.


The "Save Page Now!" page on the IA website has a "Save outlinks" feature that is useful, since important data is not necessarily on the given link (example: offvocal.zip is one more click away). ~~I don't think this can be done from the email address.~~ If you add “capture outlinks” to the subject line, those will be preserved as well.
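The email route could be batched roughly as follows. This is only a sketch: the 300-URL limit and the "capture outlinks" subject line are taken from the comment above and have not been verified, the function name `build_save_messages` is hypothetical, the fallback subject is arbitrary, and the addresses are passed in by the caller rather than hard-coded.

```python
from email.message import EmailMessage
from typing import Iterable, List

# Limit quoted in the comment above; treat it as unverified.
MAX_URLS_PER_MESSAGE = 300


def build_save_messages(
    urls: Iterable[str],
    sender: str,
    recipient: str,
    capture_outlinks: bool = False,
) -> List[EmailMessage]:
    """Split URLs into messages of at most MAX_URLS_PER_MESSAGE lines each."""
    unique = list(dict.fromkeys(urls))  # drop duplicates, keep order
    messages = []
    for start in range(0, len(unique), MAX_URLS_PER_MESSAGE):
        batch = unique[start:start + MAX_URLS_PER_MESSAGE]
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient
        # Subject wording for outlinks follows the comment above;
        # the fallback subject is an arbitrary placeholder.
        msg["Subject"] = "capture outlinks" if capture_outlinks else "Save Page Now"
        msg.set_content("\n".join(batch))  # one URL per line
        messages.append(msg)
    return messages
```

The resulting messages can be handed to `smtplib.SMTP.send_message`. Since replies are reportedly sometimes skipped, keeping a record of which batches were sent would make it possible to retry later.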

"Save Page Now!" has a limit on how many jobs you can run at once (usually around 3 maximum, maybe because I always choose "save outlinks"?).

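If the job limit mentioned above holds, a queue on the VocaDB side would need to cap how many saves run at once. The sketch below shows one way to do that with a thread pool; the limit of 3 is the figure observed in the comment, not a documented value, and `submit` is a hypothetical callable that performs one save request and returns its result.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable

# Observed limit from the comment above; not a documented value.
MAX_CONCURRENT_JOBS = 3


def archive_all(
    urls: Iterable[str],
    submit: Callable[[str], str],
    max_jobs: int = MAX_CONCURRENT_JOBS,
) -> Dict[str, str]:
    """Run submit(url) for every URL with at most max_jobs in flight.

    Returns a mapping from URL to the submit result, or to an
    "error: ..." string when the call raised an exception.
    """
    url_list = list(urls)

    def attempt(url: str) -> str:
        try:
            return submit(url)
        except Exception as exc:  # keep one failure from stopping the rest
            return f"error: {exc}"

    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        return dict(zip(url_list, pool.map(attempt, url_list)))
```

Failed entries stay in the result map, so a later pass can retry them.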

Sending media links as well, not only external links, would be good for checking video data at least (e.g. the video description, as in VocaDB/vocadb#1386). YouTube videos are sometimes functional in the Wayback Machine. Other websites aren't (NND's old login wall, heavy reliance on JavaScript, etc.), but archive.org also accepts uploads apart from Wayback Machine captures (example: https://archive.org/details/soundcloud-343968920).

szc126 avatar Nov 15 '21 14:11 szc126

Finding out how VGMdb queues links could also be helpful.

szc126 avatar Nov 18 '21 07:11 szc126