arcade-docs

Archive links for auction links

Open Poliwrath opened this issue 3 years ago • 5 comments

YAJ listing pages don't last very long (unsure how long eBay ones last), and aucfan and the countless other proxy sites that save YAJ pages aren't guaranteed to keep them. Archiving could be extended to every link, but most sources (e.g. iMp95's site) have been on the internet for ages, so the likelihood of them suddenly disappearing is probably low. Some Japanese Twitter users also purge old tweets, though.

Also, closedsearch is an invaluable resource for finding YAJ pages that are less than 6 months old.

Poliwrath • Sep 07 '21 17:09

Yep, agreed, we need some work here. I'm also a fan of using https://aucview.aucfan.com/yahoo/<auction ID> to check deleted auctions and removed images, but even that has its limitations.
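For illustration, building that lookup URL from an auction ID could look roughly like the sketch below. The helper name `aucview_url` is hypothetical, and the only thing assumed about the site is the URL pattern quoted above.

```python
from urllib.parse import quote

# URL prefix taken from the pattern mentioned in this thread.
AUCVIEW_BASE = "https://aucview.aucfan.com/yahoo/"


def aucview_url(auction_id: str) -> str:
    """Build the aucview lookup URL for a Yahoo auction ID (hypothetical helper)."""
    auction_id = auction_id.strip()
    if not auction_id:
        raise ValueError("auction ID must not be empty")
    # Percent-encode the ID so stray characters cannot alter the URL path.
    return AUCVIEW_BASE + quote(auction_id, safe="")
```

Whether the page still shows the images for a given auction is a separate question; this only constructs the address to check.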

shizmob • Sep 10 '21 02:09

FWIW, some YAJ sellers go the extra step of deleting auction images after the listing is over (before the roughly 6-month automatic deletion of the auction). I don't have a list of sellers who do that on hand, but I do know cyberdaioo does it sometimes.

Poliwrath • Sep 10 '21 02:09

We already talked about an even more general solution: automatically crawling links on a regular schedule and persisting the files, e.g. using a GitHub Actions cron job that runs once a day.

One option to get this idea started might be to focus on the Yahoo auction links first and explore it with that limited scope. We would probably learn a lot from that before this can or should be scaled further.
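A first step with that limited scope could be as small as collecting the Yahoo auction IDs referenced in the repository. Here is a rough sketch in Python, not a finished crawler. It assumes listing links look like `https://page.auctions.yahoo.co.jp/jp/auction/<ID>` and that the docs are Markdown files; `find_yaj_ids` and `scan_repo` are hypothetical names.

```python
import re
from pathlib import Path

# Assumption: Yahoo auction links use this URL form; adjust the pattern
# if the repository contains other variants.
YAJ_LINK = re.compile(
    r"https?://page\.auctions\.yahoo\.co\.jp/jp/auction/([A-Za-z0-9]+)"
)


def find_yaj_ids(text: str) -> list[str]:
    """Return the unique auction IDs found in text, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in YAJ_LINK.finditer(text):
        seen.setdefault(match.group(1), None)
    return list(seen)


def scan_repo(root: str, suffix: str = ".md") -> dict[str, list[str]]:
    """Map each file under root (with the given suffix) to the auction IDs it links to."""
    results: dict[str, list[str]] = {}
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        ids = find_yaj_ids(path.read_text(encoding="utf-8", errors="replace"))
        if ids:
            results[str(path)] = ids
    return results
```

A daily job could run `scan_repo(".")` and hand any IDs it has not seen before to whatever does the actual downloading.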

voidderef • Sep 11 '21 23:09

Any updates on this? I am already finding dead links in the Sega boards section.

biggestsonicfan • Oct 15 '21 23:10

Yeah! I've been working on a small Python tool called aucscrape in the meantime. It supports finding Yahoo, eBay and Mercari auction links and retrieving and saving their metadata and media using either the original site or a number of mirror sites. Right now its scraping support is limited to Yahoo auctions, but I'm planning to add eBay and Mercari scraping when I can.

I've already run it over the repository and stored all Yahoo auctions locally, so rest assured those are safe right now. More soon!

shizmob • Oct 20 '21 20:10