OBOFoundry.github.io

Submit OBO prefixes to prefix.cc

Open cmungall opened this issue 6 years ago • 7 comments

We release prefixes using the SHACL vocabulary here: https://raw.githubusercontent.com/OBOFoundry/OBOFoundry.github.io/master/registry/obo_prefixes.ttl

(still need a PURL for this...)
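For readers unfamiliar with SHACL prefix declarations, here is a minimal sketch of how a prefix map could be rendered as `sh:declare` entries. The `sh:declare`, `sh:prefix`, and `sh:namespace` terms come from the SHACL vocabulary; the function name and the exact Turtle layout are assumptions for illustration and may not match how obo_prefixes.ttl is actually generated.

```python
from __future__ import annotations

SH = "http://www.w3.org/ns/shacl#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def shacl_declarations(prefix_map: dict[str, str]) -> str:
    """Render a prefix map as SHACL ``sh:declare`` entries in Turtle.

    Hypothetical helper: the layout (one anonymous node holding all
    declarations) is an assumption, not the layout of the real file.
    """
    entries = [
        f'    [ sh:prefix "{prefix}" ; sh:namespace "{namespace}"^^xsd:anyURI ]'
        for prefix, namespace in sorted(prefix_map.items())
    ]
    lines = [
        f"@prefix sh: <{SH}> .",
        f"@prefix xsd: <{XSD}> .",
        "",
        "[ sh:declare",
        " ,\n".join(entries),
        "] .",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(shacl_declarations({"GO": "http://purl.obolibrary.org/obo/GO_"}))
```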

We are now able to submit OBO PURLs to http://prefix.cc, which is maintained by @cygri (see also https://github.com/cygri/prefix.cc/issues/27). Rather than submitting them all manually, it would be good to figure out an automatic sync.

Or maybe this is better done indirectly via identifiers.org or n2t.net? cc @jkunze

cmungall, Aug 19 '19 16:08

Will prefix.cc also start allowing uppercase prefixes?

balhoff, Aug 19 '19 16:08

It would be easy to load, if only to test, that batch of prefixes into n2t.net. I see some conflicts with existing prefixes in n2t.net, such as for CHEBI and DOID; normally, n2t resolves such conflicts by letting the earliest definition take precedence.
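The "earliest definition takes precedence" rule can be sketched as a merge over prefix maps in registration order. This is only an illustration of the rule as described above, not n2t's actual implementation; the function name is hypothetical and the URI for the pre-existing entry is a made-up placeholder.

```python
from __future__ import annotations


def merge_first_wins(
    sources: list[tuple[str, dict[str, str]]],
) -> tuple[dict[str, str], list[tuple[str, str, str]]]:
    """Merge prefix maps so that the earliest definition of a prefix wins.

    ``sources`` is a list of (source name, prefix map) pairs, ordered from
    earliest to latest. Returns the merged map plus a list of conflicts as
    (prefix, losing source name, losing URI prefix) triples.
    """
    merged: dict[str, str] = {}
    conflicts: list[tuple[str, str, str]] = []
    for source_name, prefix_map in sources:
        for prefix, uri_prefix in prefix_map.items():
            if prefix not in merged:
                merged[prefix] = uri_prefix
            elif merged[prefix] != uri_prefix:
                # A later source disagrees with an earlier one: keep the
                # earlier definition and record the conflict for review.
                conflicts.append((prefix, source_name, uri_prefix))
    return merged, conflicts


if __name__ == "__main__":
    existing = {"CHEBI": "https://example.org/chebi/"}  # placeholder, not real n2t data
    obo = {
        "CHEBI": "http://purl.obolibrary.org/obo/CHEBI_",
        "GO": "http://purl.obolibrary.org/obo/GO_",
    }
    merged, conflicts = merge_first_wins([("existing", existing), ("obo", obo)])
    print(merged)
    print(conflicts)
```

Reporting the conflicts alongside the merged map makes it possible to review entries such as CHEBI and DOID by hand instead of silently dropping the OBO definition.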

jkunze, Aug 19 '19 17:08

What is the status of this?

nlharris, Mar 03 '20 23:03

FYI, prefix.cc has been updated to allow underscores as the final character in a namespace URI.
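The trailing underscore matters for OBO because OBO term PURLs join the ID space and the local identifier with an underscore, so the namespace URI itself has to end in `_`. A small sketch of that expansion follows; the function names are hypothetical, and the pattern shown is the standard OBO PURL form rather than anything specific to prefix.cc.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"


def obo_uri_prefix(prefix: str) -> str:
    """Return the namespace URI for an OBO ID space, e.g. GO -> .../obo/GO_"""
    return f"{OBO_BASE}{prefix}_"


def expand(curie: str) -> str:
    """Expand an OBO CURIE such as GO:0008150 into its PURL."""
    prefix, local_id = curie.split(":", 1)
    return obo_uri_prefix(prefix) + local_id


if __name__ == "__main__":
    print(obo_uri_prefix("GO"))  # http://purl.obolibrary.org/obo/GO_
    print(expand("GO:0008150"))  # http://purl.obolibrary.org/obo/GO_0008150
```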

cygri, Dec 30 '20 11:12

Thanks @cygri

How should we progress with synchronizing with prefix.cc, either directly from OBO or from a registry that consumes from OBO, such as n2t or bioregistry.io? Is it possible to have some kind of semi-automated process, or do you recommend manual submission?

It would also be good to be able to batch update prefix.cc. For example, for GO on http://prefix.cc/context you have:

"go": "http://purl.org/obo/owl/GO#",

This has not been in use for over a decade.

There are other odd entries, e.g. "drugbank": "http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/",
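Finding such outdated entries in bulk could look roughly like the sketch below, which compares the prefix.cc context map against an expected map and reports every prefix whose registered URI differs. The function name is hypothetical, and lowercasing the expected prefixes before lookup is an assumption based on the lowercase keys shown above.

```python
from __future__ import annotations


def find_outdated(
    prefix_cc_map: dict[str, str],
    expected_map: dict[str, str],
) -> dict[str, tuple[str, str]]:
    """Return {prefix: (current URI, expected URI)} for mismatched entries.

    Prefixes that are missing from prefix.cc are ignored here, since they
    need a new submission rather than an update.
    """
    outdated: dict[str, tuple[str, str]] = {}
    for prefix, expected in expected_map.items():
        key = prefix.lower()  # assumption: prefix.cc keys are lowercase
        current = prefix_cc_map.get(key)
        if current is not None and current != expected:
            outdated[key] = (current, expected)
    return outdated


if __name__ == "__main__":
    prefix_cc_map = {"go": "http://purl.org/obo/owl/GO#"}
    expected_map = {"GO": "http://purl.obolibrary.org/obo/GO_"}
    for prefix, (current, expected) in find_outdated(prefix_cc_map, expected_map).items():
        print(f"{prefix}: {current} -> {expected}")
```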

cmungall, Jun 01 '21 15:06

FYI here's the Python code you'd need to automate updating prefix.cc:

import requests
import bioregistry


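def create(curie_prefix: str, uri_prefix: str) -> requests.Response:
    """Submit a CURIE prefix and its URI prefix to prefix.cc."""
    # The URI prefix is sent as the "create" form field of a POST
    # to the page for the CURIE prefix.
    return requests.post(
        f"https://prefix.cc/{curie_prefix}",
        data={"create": uri_prefix},
    )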


def main():
    prefix_cc_map = requests.get("https://prefix.cc/context").json()["@context"]
    for record in bioregistry.resources():
        if not record.get_obofoundry_prefix():
            continue
        uri_prefix = record.get_uri_prefix()
        if not uri_prefix:
            continue
        if uri_prefix == prefix_cc_map.get(record.prefix):
            # No need to re-create something that's already
            # both available and correct wrt Bioregistry/OBO
            continue

        print("Creating record for", record.prefix, uri_prefix)
        res = create(record.prefix, uri_prefix)
        print(res.text)

        # We're breaking here since we can only make one
        # update per day
        break


if __name__ == "__main__":
    main()

As of https://github.com/biopragmatics/bioregistry/pull/1056, this is now possible with python -m bioregistry.export.prefixcc

Unfortunately, there seems to be some kind of rate limiting: you can only post one prefix per day. @cygri, is there any possibility of adding some kind of authentication for trusted posters?

cthoyt, Feb 26 '23 20:02

I implemented the script from above as a GitHub Action in https://github.com/biopragmatics/bioregistry/pull/1056. It will run nightly until all of the OBO Foundry prefixes are in Prefix.cc, then continue with the remaining prefixes. Note that uploads are rate limited to 1 per day for a given IP address. The first successful CI run was: https://github.com/biopragmatics/bioregistry/actions/runs/8325600134/job/22779554502
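The "OBO Foundry first, then the rest, one per run" ordering can be sketched as a pure selection function. This is an illustration of the idea under stated assumptions, not the code from the pull request; the `Candidate` type, the `next_upload` name, and the non-OBO example entry are all hypothetical.

```python
from __future__ import annotations

from typing import NamedTuple


class Candidate(NamedTuple):
    prefix: str
    uri_prefix: str
    is_obo: bool  # True if the prefix belongs to an OBO Foundry ontology


def next_upload(
    candidates: list[Candidate],
    prefix_cc_map: dict[str, str],
) -> Candidate | None:
    """Pick the single record to upload in this run, or None if all are in sync."""
    pending = [
        c for c in candidates
        if prefix_cc_map.get(c.prefix) != c.uri_prefix
    ]
    # OBO Foundry prefixes sort first; ties are broken alphabetically so
    # that successive nightly runs progress in a stable order.
    pending.sort(key=lambda c: (not c.is_obo, c.prefix))
    return pending[0] if pending else None


if __name__ == "__main__":
    candidates = [
        Candidate("example", "https://example.org/", False),  # made-up entry
        Candidate("go", "http://purl.obolibrary.org/obo/GO_", True),
    ]
    print(next_upload(candidates, {"go": "http://purl.org/obo/owl/GO#"}))
```

Because only one upload per day goes through, choosing the record deterministically means each nightly run picks up where the previous one left off without having to store any state.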

cthoyt, Mar 18 '24 14:03