Cleaning up the forums igset

Open JustAnotherArchivist opened this issue 7 years ago • 0 comments

The forums igset is becoming quite messy:

  • It's large yet incomplete. There are currently 57 ignores in it, but this still doesn't cover many forum software packages, including some that are quite frequently used (e.g. modern vBulletin, at least some versions of WBB per #261, Discourse). In other words, many more ignores should be in this igset.
  • It is difficult to see which ignores belong together, i.e. are for the same forum software.
  • It is inconsistent in that links to individual posts are ignored for some software (e.g. SMF) but not for others (e.g. vBulletin's showthread.php?t=123&p=456).
  • There are also some ignores in there which don't seem to be related to forum software at all, e.g. /ad\.pl and /tortoise\.pl added in 590714cd.

I'm thinking that we should consider splitting it up into individual igsets for each forum software, e.g. forums-smf, forums-vbulletin, forums-discourse. Maybe even further with versioning or an identifier when the software supports different URL schemes. What do you think about this?
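
To make the proposal a bit more concrete, here is a rough sketch of how such a split could be prototyped. Everything in it is an assumption for illustration: the pattern list is a made-up stand-in for the real forums igset (only /ad\.pl and /tortoise\.pl are taken from it), the substring hints in SOFTWARE_HINTS are a hypothetical way of guessing which software a pattern belongs to, and the name/patterns/type layout in to_igset is my assumption of what an igset file looks like rather than a confirmed schema.

```python
import json
import re

# Hypothetical stand-in for the flat "forums" igset. Only the last two
# patterns are mentioned in the issue; the others are illustrative guesses.
FLAT_PATTERNS = [
    r"index\.php\?action=printpage",   # illustrative SMF-style pattern
    r"index\.php\?topic=\d+\.msg\d+",  # illustrative SMF post link
    r"showthread\.php\?.*&p=\d+",      # illustrative vBulletin post link
    r"/ad\.pl",
    r"/tortoise\.pl",
]

# Hypothetical hints: substrings suggesting which software a pattern targets.
SOFTWARE_HINTS = {
    "forums-smf": ("action=", "topic="),
    "forums-vbulletin": ("showthread",),
}


def split_igset(patterns, hints):
    """Group patterns by software; return (groups, unassigned)."""
    groups = {name: [] for name in hints}
    unassigned = []
    for pattern in patterns:
        re.compile(pattern)  # fail early on a malformed regex
        for name, needles in hints.items():
            if any(needle in pattern for needle in needles):
                groups[name].append(pattern)
                break
        else:
            # Nothing matched: probably not forum-related, review by hand.
            unassigned.append(pattern)
    return groups, unassigned


def to_igset(name, patterns):
    """Build one igset document (assumed layout, not a confirmed schema)."""
    return {"name": name, "patterns": patterns, "type": "ignore_patterns"}


groups, unassigned = split_igset(FLAT_PATTERNS, SOFTWARE_HINTS)
igsets = [to_igset(name, pats) for name, pats in groups.items()]

if __name__ == "__main__":
    for igset in igsets:
        print(json.dumps(igset, indent=4))
    print("needs manual review:", unassigned)
```

Anything that lands in unassigned would need a manual look, which is also where unrelated ignores like the two .pl ones would surface.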

JustAnotherArchivist, May 08 '18 14:05