Representation of "Container" for Bioschema, and Ontology

Open vsoch opened this issue 6 years ago • 9 comments

hey opencontainers team!

I'm working on creating a representation for a "Container" in bioschemas, the goal being that this feeds into schema.org, and schema.org is the bread and butter powering the Google Datasets search. So you can imagine, being able to represent containers and then have Google discover them across these webby places and index them is going to be very important for curation and search of (what is currently) a messy universe of peanut butter and jelly container piles.

Toward this goal, I'm taking a first shot at this addition. Importantly, I want the specification to be entirely derived from the specifications here, and start out by modeling the higher level things. The work is early, but I want to ask you about some mappings (if you think they are reasonable to start). Remember that we can of course tweak these once the initial draft is done! Here we go!

Organization

Please see here for the organization I think is logical for the root of a Container, and also the modeling of a ContainerRecipe and ContainerRegistry (which now I'm thinking might make sense to mirror OCI and be called ContainerDistribution). For easy reference, here is the overall first plan I'm working on:

Thing > CreativeWork > SoftwareApplication > Container > ContainerImage
                                                         ContainerRuntime
                                                         ContainerRecipe
                                                         ContainerDistribution

where the list there would match the Image, runtime, (no match for recipe, but important to model because we have repositories of just Recipes, for example?), and then distribution. I will have a proper drafty thing for you to critique this in detail, for now I just want to ask about particular mappings (below).
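To make the plan above concrete, here is a minimal sketch of the proposed hierarchy as a child-to-parent lookup. The `PARENT` table and the `lineage` helper are hypothetical names made up for illustration; only the type names come from the plan, and Container and its four subtypes are a proposal, not existing schema.org types.

```python
# Child -> parent mapping for the proposed hierarchy.
# Container and its subtypes are proposed in this issue, not part of schema.org.
PARENT = {
    "CreativeWork": "Thing",
    "SoftwareApplication": "CreativeWork",
    "Container": "SoftwareApplication",
    "ContainerImage": "Container",
    "ContainerRuntime": "Container",
    "ContainerRecipe": "Container",
    "ContainerDistribution": "Container",
}


def lineage(type_name):
    """Return the chain of types from the root (Thing) down to type_name."""
    chain = [type_name]
    # Walk upward until we reach a type with no recorded parent.
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return list(reversed(chain))


print(" > ".join(lineage("ContainerRecipe")))
```

A flat lookup like this keeps the four subtypes as siblings under Container, which matches the layout in the plan.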

Ontology

As far as I can tell, we don't have a container ontology. The closest I found (thanks @satra !) is this one that has virtualization. I reached out to them to see if they had thoughts about modeling containers there, but we would want uris to reference in this schema. Does anyone know anything about options here? For a lot of the below, I'm wondering if we need to tackle this (somewhat scary and uncharted) domain of creating a simple container technology based off the specs here.

Features

The "SoftwareApplication" has a field for "FeatureList." In the image spec I noticed general features (at the bottom) for signatures and federated naming. Is that a reasonable start?

featureList

  • Text or URL
  • Features or modules provided by this application (and possibly required by other applications).

We would want to say "Please choose a value from EDAM:Operation."

And then the example would be like:

{
  "@type": ["SoftwareApplication", "Container"],
  "featureList": ["http://oci.org/ontology/topic_0821",
                  "http://oci.org/ontology/topic_0821"]
}
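As a rough sketch of how such a record might be generated rather than written by hand, the snippet below builds the same structure in Python. The `container_record` helper is a hypothetical name, the `@context` value is an assumption based on common JSON-LD usage for schema.org, and the feature URIs are the placeholders from the example, not real ontology terms. Dropping repeated URIs is a choice made here, not something the proposal requires.

```python
import json


def container_record(features):
    """Build a minimal JSON-LD style record for the proposed Container type."""
    # Drop repeated feature URIs while keeping their original order.
    unique_features = list(dict.fromkeys(features))
    return {
        "@context": "https://schema.org",  # assumption: usual schema.org context
        "@type": ["SoftwareApplication", "Container"],
        "featureList": unique_features,
    }


record = container_record([
    "http://oci.org/ontology/topic_0821",  # placeholder URI from the example
    "http://oci.org/ontology/topic_0821",
])
print(json.dumps(record, indent=2))
```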

Let's start with just this simple question for now - I'm just looking over the parent terms and attributes now (from SoftwareApplication and CreativeWork) and will tackle the Container specific ones next (from the standards here) and I'll update this issue as I run into questions. In the meantime, please let me know your thoughts on the above, and please excuse my ontology noobness, just learning this stuff :)

vsoch avatar Sep 17 '18 23:09 vsoch

Quick update: started ContainerRecipe draft. This one is harder because it doesn't parallel a spec here - I'll be doing the Image/Distribution and others soon, will post here when I do.

vsoch avatar Sep 25 '18 15:09 vsoch

hey oci maintainers! Is this the right place to have this discussion? I now have generated the following:

and I'd like to get feedback, if possible (and even minimal!) so I can move forward with submission to schema.org. The goal is to be able to have these definitions to tag containers (and their associated stuffs like recipe files) in webby places, and then a crawler like Google Dataset Search can index them. My strategy has been to follow the OCI specs (except for ContainerRecipe since that doesn't seem to have one?) so any feedback you might have would be greatly appreciated! The specifications repository is here --> https://github.com/openschemas/specifications and the rendered versions that it serves are linked above. And once it's submitted, I'm going to take a shot at working with biocontainers to tag their containers, of course Singularity Hub, and a repo of Dockerfiles that I made, because :)

Looking forward to your feedback!

vsoch avatar Sep 26 '18 09:09 vsoch

hello... anyone out there? :(

vsoch avatar Oct 09 '18 17:10 vsoch

ping @opencontainers/image-spec-maintainers + @vbatts :)

@vsoch I'd consider emailing the OCI dev mailing list too: https://groups.google.com/a/opencontainers.org/forum/#!forum/dev

@vsoch there's an OCI meeting tomorrow I invited you to

caniszczyk avatar Oct 09 '18 17:10 caniszczyk

Thanks @caniszczyk 😊 I’d like to discuss how the work being done here might be integrated into schema.org so we can have containers show up nicely from registries, personal pages, etc in Google Datasets (or similar) search. The entrypoint seems to be schema.org but the expertise is all here!

vsoch avatar Oct 09 '18 17:10 vsoch

@vsoch I'm a little confused about what you're actually proposing. What problems does this solve? What changes are required in OCI?

stevvooe avatar Oct 10 '18 13:10 stevvooe

No changes in OCI! The idea is that the simplified representation for schema.org should be derived from OCI, where the expertise is.

vsoch avatar Oct 10 '18 14:10 vsoch

Google Dataset Search uses schema.org, so the idea is to get a representation of a container in there so we can eventually tag containers and have the universe of them searchable! See -->https://www.blog.google/products/search/making-it-easier-discover-datasets/ for the first work with Datasets. I don't think we need to have all the details (attributes) from OCI, but rather the higher level ones, and this is what I'd like to discuss - what from OCI should map over to schema.org. I'm glad to do the work to generate the mapping, but the second (important) bit is to get support from the community here for the specification there.

vsoch avatar Oct 10 '18 14:10 vsoch

https://github.com/schemaorg/schemaorg/issues/2059

vsoch avatar Oct 10 '18 21:10 vsoch