[Feature] Netkan source for forum threads

Open HebaruSan opened this issue 6 years ago • 4 comments

Problem

~~SearchAndRescue has had errors on http://status.ksp-ckan.org/ for some time now, and the latest indexed version is out of date.~~ (Fixed by KSP-CKAN/NetKAN#6092)

SearchAndRescue is hosted on Dropbox because its author prefers not to use SpaceDock or GitHub, so its metadata has to be maintained manually. (There's a SearchAndRescue.netkan file, but it essentially just automates populating download_size and download_hash; everything else has to be filled in by hand.)

Suggestion

I was trying to think of ways to improve this, and hit upon the idea of trying to get download links from forum threads, with a value like this in a netkan file:

    "$kref": "#/ckan/forum/123456-Topic/dropbox.com",

Proposed format, broken out by pieces of text between forward slashes:

  • Standard #/ckan kref prefix
  • forum to indicate the link is on a KSP forum thread
  • 123456-Topic to indicate the thread-specific part of the thread's URL, to be appended to https://forum.kerbalspaceprogram.com/index.php?/topic/
  • dropbox.com to specify a link search string to be matched
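As a rough illustration of how such a kref could be taken apart, here is a minimal Python sketch. The names `ForumKref` and `parse_forum_kref` are hypothetical and not part of Netkan, and letting the search string contain further slashes is an assumption that goes slightly beyond the format described above.

```python
from typing import NamedTuple

# Base URL taken from the proposal; the thread-specific part is appended to it
FORUM_TOPIC_BASE = "https://forum.kerbalspaceprogram.com/index.php?/topic/"


class ForumKref(NamedTuple):
    """Hypothetical container for the two useful pieces of a forum kref."""
    topic_url: str
    link_search: str


def parse_forum_kref(kref: str) -> ForumKref:
    """Split '#/ckan/forum/<thread part>/<search string>' into its pieces."""
    # maxsplit=4 keeps any extra slashes inside the search string (assumption)
    parts = kref.split("/", 4)
    if len(parts) != 5 or parts[:3] != ["#", "ckan", "forum"]:
        raise ValueError(f"Not a forum kref: {kref!r}")
    topic, search = parts[3], parts[4]
    if not topic or not search:
        raise ValueError(f"Forum kref is missing a thread or search string: {kref!r}")
    return ForumKref(FORUM_TOPIC_BASE + topic, search)


print(parse_forum_kref("#/ckan/forum/123456-Topic/dropbox.com"))
```

Rejecting anything that doesn't start with `#/ckan/forum/` keeps the sketch from silently accepting krefs meant for other sources.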

Netkan could:

  1. Download the HTML for the forum thread (or even better, use an API if one exists)
  2. Parse it looking for links
  3. Return the first link that matches the search string from the kref
  4. Download and process the file as normal to generate a ckan file
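Steps 2 and 3 could look something like the following Python sketch, which uses only the standard library's `html.parser`. It assumes the thread's HTML has already been downloaded (step 1 is omitted), and `LinkCollector`, `first_matching_link`, and the sample HTML with its URLs are all made up for illustration.

```python
from html.parser import HTMLParser
from typing import List, Optional


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def first_matching_link(html: str, search: str) -> Optional[str]:
    """Return the first link containing the search string, or None."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return next((link for link in collector.links if search in link), None)


# Hypothetical post body; the URLs are placeholders, not real downloads
sample = (
    '<p><a href="https://spacedock.info/mod/1">SpaceDock</a> '
    '<a href="https://www.dropbox.com/s/abc/Mod.zip?dl=1">Download</a> '
    '<a href="https://www.dropbox.com/s/old/Old.zip">Old version</a></p>'
)
print(first_matching_link(sample, "dropbox.com"))
```

Because only the first match is returned, the order of links in the post decides which file gets indexed.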

This might be somewhat more automated than the current process for a mod like SearchAndRescue.

Caveats

This method would probably be a bit error-prone. It would be sensitive to the exact formatting of a post; an author might rearrange their list of downloads and find that the wrong ones were now being checked. But as long as the requirements were simple and clear, it ought to be possible to keep a thread formatted in a parseable way.

Less clear are the expectations that users might develop. Authors might expect that dependencies or version requirements could be pulled from their threads, which probably isn't feasible, since it would require free-form natural language processing. We could try inventing a simplified metadata language for specifying such things, but that could turn this into a very large project with requirements for reporting syntax errors, etc.

CKAN's currently indexed downloads are overwhelmingly on SpaceDock, GitHub, and archive.org:

image

However, since nearly all mods have forum threads, some authors may be tempted to change their mods' metadata to check the forum thread. Obviously this should be avoided whenever possible; a forum thread should only be used for Dropbox-style hosts that have no formal organization of releases.

HebaruSan avatar Dec 13 '17 20:12 HebaruSan

Another interesting possibility is a $vref for forum threads. Rather than trying to get all of the info about a mod from the forum, we could get most of it from the host with an existing $kref, and then just use the forum thread as the authoritative source for version info. This could have problems with getting out of sync, though, if a modder uploads a new version and forgets to update the forum thread for example.

HebaruSan avatar Nov 18 '18 18:11 HebaruSan

> This could have problems with getting out of sync, though, if a modder uploads a new version and forgets to update the forum thread for example.

We could solve that the same way KSP-AVC does: require the mod version to match. The forum $vref could stipulate a format for the title (and presumably raise a warning if it doesn't fit):

    [Min–Max] Mod Name v1.2.3

Then if the mod version is the same as what we're inflating, we use the compatibility from the title, otherwise we don't. That way we could be sure that we weren't applying it to the wrong version.
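A minimal Python sketch of that check might look like this. The regular expression, the function name `compat_from_title`, and the decisions to accept a single game version in the brackets, an optional `v` prefix, and a plain hyphen in place of the en dash are all assumptions rather than a settled format.

```python
import re
from typing import Optional, Tuple

# Assumed title shape: "[Min–Max] Mod Name v1.2.3"
TITLE_PATTERN = re.compile(
    r"^\[(?P<min>\d+(?:\.\d+)*)"                     # minimum game version
    r"(?:\s*[\u2013-]\s*(?P<max>\d+(?:\.\d+)*))?\]"  # optional maximum, en dash or hyphen
    r"\s+(?P<name>.+?)"                              # mod name
    r"\s+v?(?P<version>\d+(?:\.\d+)*)$"              # mod version, optional "v"
)


def compat_from_title(title: str, inflating_version: str) -> Optional[Tuple[str, str]]:
    """Return (min, max) game versions if the title matches the mod version."""
    match = TITLE_PATTERN.match(title.strip())
    if match is None:
        return None  # title doesn't fit the format; a warning could go here
    # Plain string comparison; real version comparison would be looser
    if match["version"] != inflating_version.lstrip("v"):
        return None  # title describes a different release, so ignore it
    game_min = match["min"]
    game_max = match["max"] or game_min
    return game_min, game_max


print(compat_from_title("[1.8\u20131.12] Mod Name v1.2.3", "1.2.3"))
```

Returning `None` for both a malformed title and a version mismatch keeps the sketch short; a real implementation would probably want to tell those cases apart so only the first raises a warning.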

I might look for some mods without $vrefs but with nicely formatted forum thread titles on which to pilot this...

HebaruSan avatar Apr 09 '22 22:04 HebaruSan


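    from git import Repo
    from netkan.repos import NetkanRepo, CkanMetaRepo

    # Local clones of the NetKAN and CKAN-meta repositories
    nkr = NetkanRepo(Repo('/Users/User/github/NetKAN'))
    ckmr = CkanMetaRepo(Repo('/Users/User/github/CKAN-meta'))

    # Forum homepages of candidate mods: take the latest indexed release of each
    # netkan that has no $vref and is not on_netkan, then keep those without a
    # remote-avc resource whose homepage is a KSP forum thread
    [ck.resources['homepage']
     for ck in (max(ckmr.ckans(nk.identifier), default=None, key=lambda ck: ck.version)
                for nk in nkr.netkans()
                if not nk.has_vref and not nk.on_netkan)
     if hasattr(ck, 'resources') and 'remote-avc' not in ck.resources and ck.resources.get('homepage', '').startswith('https://forum.kerbalspaceprogram.com')]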

HebaruSan avatar Apr 09 '22 23:04 HebaruSan

Out of the first 100 candidate mods, three use a suitable title format:

  • https://forum.kerbalspaceprogram.com/index.php?/topic/205097-1122-anglecan-part-tweaks-10/
  • https://forum.kerbalspaceprogram.com/index.php?/topic/204809-122-anglecan-progression-10/
  • https://forum.kerbalspaceprogram.com/index.php?/topic/87463-173-community-delta-v-map-27/

Most either omit the version or include extraneous text after it.

HebaruSan avatar Apr 10 '22 01:04 HebaruSan