CKAN
[Feature] Netkan source for forum threads
Problem
~~SearchAndRescue has had errors on http://status.ksp-ckan.org/ for some time now, and the latest indexed version is out of date.~~ (Fixed by KSP-CKAN/NetKAN#6092)
This mod is hosted on DropBox because the author prefers not to use SpaceDock or GitHub. This requires manual maintenance of the metadata. (There's a SearchAndRescue.netkan file, which essentially just automates the process of populating `download_size` and `download_hash`, because everything else has to be filled in manually.)
Suggestion
I was trying to think of ways to improve this, and hit upon the idea of trying to get download links from forum threads, with a value like this in a netkan file:
"$kref": "#/ckan/forum/123456-Topic/dropbox.com",
Proposed format, broken out by pieces of text between forward slashes:
- Standard `#/ckan` kref prefix
- `forum` to indicate the link is on a KSP forum thread
- `123456-Topic` to indicate the thread-specific part of the thread's URL, to be appended to `https://forum.kerbalspaceprogram.com/index.php?/topic/`
- `dropbox.com` to specify a link search string to be matched
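As a rough illustration of how such a kref could be split into its pieces, here is a minimal Python sketch. The names `ForumKref`, `parse_forum_kref`, and `FORUM_TOPIC_BASE` are hypothetical and not part of Netkan. The sketch assumes the link search string is everything after the fourth slash.

```python
from typing import NamedTuple

# Base URL taken from the proposal above; the thread part of the kref is appended to it.
FORUM_TOPIC_BASE = 'https://forum.kerbalspaceprogram.com/index.php?/topic/'


class ForumKref(NamedTuple):
    """Hypothetical container for the two useful pieces of a forum kref."""
    thread_url: str
    link_search: str


def parse_forum_kref(kref: str) -> ForumKref:
    """Split '#/ckan/forum/<thread>/<search>' into a thread URL and a search string."""
    # maxsplit=4 keeps any further slashes inside the search string
    parts = kref.split('/', 4)
    if len(parts) != 5 or parts[:3] != ['#', 'ckan', 'forum']:
        raise ValueError(f'Not a forum kref: {kref!r}')
    thread, search = parts[3], parts[4]
    if not thread or not search:
        raise ValueError(f'Forum kref is missing a thread or search string: {kref!r}')
    return ForumKref(FORUM_TOPIC_BASE + thread, search)
```

For the example value above, this would yield the thread URL ending in `123456-Topic` and the search string `dropbox.com`.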
Netkan could:
- Download the HTML for the forum thread (or even better, use an API if one exists)
- Parse it looking for links
- Return the first link that matches the search string from the kref
- Download and process the file as normal to generate a ckan file
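The link-matching step in that list could look something like the following Python sketch, which uses only the standard library's `html.parser`. `LinkCollector` and `first_matching_link` are hypothetical names. Fetching the thread HTML is left out, and the sketch assumes a plain substring match against each `href`, which may be too loose in practice.

```python
from html.parser import HTMLParser
from typing import List, Optional


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)


def first_matching_link(html: str, search: str) -> Optional[str]:
    """Return the first link containing the search string, or None if there is none."""
    collector = LinkCollector()
    collector.feed(html)
    return next((link for link in collector.links if search in link), None)
```

Because the first match wins, this sketch has exactly the ordering sensitivity described under Caveats: if an author moves an older download above the current one, the older one gets picked.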
This might be somewhat more automated than the current process for a mod like SearchAndRescue.
Caveats
This method would probably be a bit error-prone. It would be sensitive to the exact formatting of a post; an author might rearrange their list of downloads and find that the wrong ones were now being checked. But as long as the requirements were simple and clear, it ought to be possible to keep a thread formatted in a parseable way.
Less clear are the expectations that users might develop. Authors might expect that dependencies or version requirements could be pulled from their threads, which probably isn't feasible given the requirement of free form natural language processing. We could try inventing a simplified metadata language for specifying such things, but that could turn this into a very large project with requirements for reporting syntax errors, etc.
CKAN's currently indexed downloads are overwhelmingly on SpaceDock, GitHub, and archive.org.
However, since nearly all mods have forum threads, some authors may be tempted to change their mods' metadata to check the forum thread. Obviously this should be avoided whenever possible; a forum thread should only be used for DropBox-style hosts that have no formal organization of releases.
Another interesting possibility is a `$vref` for forum threads. Rather than trying to get all of the info about a mod from the forum, we could get most of it from the host with an existing `$kref`, and then just use the forum thread as the authoritative source for version info. This could have problems with getting out of sync, though, if a modder uploads a new version and forgets to update the forum thread for example.
We could solve that the same way KSP-AVC does: require the mod version to match. The forum `$vref` could stipulate a format for the title (and presumably raise a warning if it doesn't fit):
[Min–Max] Mod Name v1.2.3
Then if the mod version is the same as what we're inflating, we use the compatibility from the title, otherwise we don't. That way we could be sure that we weren't applying it to the wrong version.
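A minimal Python sketch of that check, under some assumptions: the regular expression is one guess at what the title format could accept (a single version or a Min–Max range in brackets, an optional `v` before the mod version), and `TitleInfo`, `parse_title`, and `compatibility_for` are hypothetical names. Versions are compared as plain strings here, so `1.0` and `1.0.0` would not match; a real implementation would presumably use proper version comparison.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Assumed title format: "[Min–Max] Mod Name v1.2.3" (en dash or hyphen, "v" optional)
TITLE_PATTERN = re.compile(
    r'^\[(?P<min>\d+(?:\.\d+)*)'                # minimum game version
    r'(?:\s*[–-]\s*(?P<max>\d+(?:\.\d+)*))?\]'  # optional maximum game version
    r'\s*(?P<name>.+?)\s+'                      # mod name
    r'v?(?P<version>\d+(?:\.\d+)*)$'            # mod version
)


class TitleInfo(NamedTuple):
    ksp_version_min: str
    ksp_version_max: str
    name: str
    mod_version: str


def parse_title(title: str) -> Optional[TitleInfo]:
    """Return the parsed title, or None if it doesn't fit the format (warn in that case)."""
    match = TITLE_PATTERN.match(title.strip())
    if match is None:
        return None
    game_min = match.group('min')
    # A single bracketed version means min and max are the same
    game_max = match.group('max') or game_min
    return TitleInfo(game_min, game_max, match.group('name'), match.group('version'))


def compatibility_for(title: str, inflating_version: str) -> Optional[Tuple[str, str]]:
    """Use the title's compatibility only if its mod version matches the one being inflated."""
    info = parse_title(title)
    if info is None or info.mod_version != inflating_version.lstrip('v'):
        return None
    return (info.ksp_version_min, info.ksp_version_max)
```

Titles that omit the version or carry extra text after it fall through to `None`, which is where a warning could be raised.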
I might look for some mods without `$vref`s but with nicely formatted forum thread titles on which to pilot this...
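from git import Repo
from netkan.repos import NetkanRepo, CkanMetaRepo

# Local clones of the NetKAN and CKAN-meta repositories
nkr = NetkanRepo(Repo('/Users/User/github/NetKAN'))
ckmr = CkanMetaRepo(Repo('/Users/User/github/CKAN-meta'))

# Homepages of the latest indexed release of each netkan that has no $vref
# and is not flagged on_netkan, keeping only releases without a remote-avc
# resource whose homepage is a KSP forum thread
[ck.resources['homepage']
 for ck in (max(ckmr.ckans(nk.identifier), default=None, key=lambda ck: ck.version)
            for nk in nkr.netkans()
            if not nk.has_vref and not nk.on_netkan)
 if hasattr(ck, 'resources')
 and 'remote-avc' not in ck.resources
 and ck.resources.get('homepage', '').startswith('https://forum.kerbalspaceprogram.com')]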
Out of the first 100 candidate mods, three use a suitable title format:
- https://forum.kerbalspaceprogram.com/index.php?/topic/205097-1122-anglecan-part-tweaks-10/
- https://forum.kerbalspaceprogram.com/index.php?/topic/204809-122-anglecan-progression-10/
- https://forum.kerbalspaceprogram.com/index.php?/topic/87463-173-community-delta-v-map-27/
Most either omit the version or include extraneous text after it.