openMINDS_core

adding a schema for platforms / infrastructures

Open lzehl opened this issue 9 months ago • 15 comments

... to provide informative metadata on the platforms and to group research products that belong to one platform

@olinux @annapaola there is overlap here with what you are planning at the moment.

Currently this connection can be done indirectly through

  • FileRepository:hostedBy->Organization
  • or through free text in the RPV description or version innovation fields

Platforms or infrastructures are not necessarily organizations, though. E.g. the EBRAINS RI is not the same as its legal entity, the EBRAINS AISBL.

@archgogo @MBAbrams this is also connected to the INCF portfolio

lzehl, Apr 01 '25 08:04

What to consider? Here is a first assessment:

DigitalPlatform

  • name (1)
  • alternateName (0-1)
  • acronym (1)
  • digitalIdentifier (0-N)
  • description (1)
  • type (1)
  • governance ?
    • legalEntity (1) Organization only ?
    • member (0-N) Organization only ?
    • userRegistration (0-1)
    • memberRegistration (0-1)
    • policy (1-N)
  • homepage (0-1)
  • jurisdiction (1) ? or is this given through the legal entity?
  • supportChannel
  • serviceOffer (1-N) (controlled: data deposit, data management, searchable resource, etc)
  • target (1-N) (linked category ?: species, experimental approaches, techniques, etc)
  • relatedTo (0-1) (other Platform)

RP(V)s can then link to it:

  • isPartOf
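
To make the cardinalities in the first assessment concrete, here is a minimal Python sketch. It is not part of openMINDS or its tooling: the names `CARDINALITY`, `count_values` and `check_cardinality` are made up for illustration, and properties that are still marked with a question mark or carry no cardinality (the `governance` block, `jurisdiction`, `supportChannel`) are left out.

```python
from typing import Any

# (min, max) occurrences per property, copied from the list above.
# max=None stands for "N" (unbounded). Hypothetical table, not openMINDS syntax.
CARDINALITY = {
    "name": (1, 1),
    "alternateName": (0, 1),
    "acronym": (1, 1),
    "digitalIdentifier": (0, None),
    "description": (1, 1),
    "type": (1, 1),
    "homepage": (0, 1),
    "serviceOffer": (1, None),
    "target": (1, None),
    "relatedTo": (0, 1),
}


def count_values(value: Any) -> int:
    """Count how many values a property holds (missing = 0, scalar = 1)."""
    if value is None:
        return 0
    if isinstance(value, (list, tuple)):
        return len(value)
    return 1


def check_cardinality(instance: dict) -> list[str]:
    """Return one message per property that violates its (min, max) range."""
    problems = []
    for prop, (low, high) in CARDINALITY.items():
        n = count_values(instance.get(prop))
        if n < low:
            problems.append(f"{prop}: expected at least {low}, got {n}")
        elif high is not None and n > high:
            problems.append(f"{prop}: expected at most {high}, got {n}")
    return problems
```

Calling `check_cardinality` on a dictionary that holds the six required properties returns an empty list; every missing or over-filled property adds one message.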

Note: There is of course more, but most of it is defined by the connected services (as is the serviceOffer). If no shortcut is added here, we would need to create all respective service instances for a platform. The strategy should be discussed before further planning. Things to consider here:

  • data type restrictions

    • for data deposit
    • for searchable resource
    • for data retrieval
  • data format restrictions

    • for data deposit
    • for searchable resource
    • for data retrieval
  • standard restrictions

    • for data deposit
    • for searchable resource
    • for data retrieval
    • for any computational service
  • featured standards (if you use standards, you get additional features)

    • for data deposit
    • for searchable resource
    • for data retrieval
    • for any computational service

etc. (basically all of them are defined by the services)

lzehl, Apr 01 '25 13:04

Questions that came up in discussion with @Raphael-Gazzotti: if a legal entity is replaced by another legal entity, to whom do the old products belong? Would all of them need to replace the old legal entity with the new one? What if there is a transition phase in which the old and new legal entities coexist?

lzehl, Apr 02 '25 10:04

This is a difficult one. It depends on the agreement between the entities on how the products should be referenced. Is there a specific use case you have in mind or do you want a general framework?

MBAbrams, Apr 02 '25 10:04

@MBAbrams If we can, I want to create a general framework that could be expanded modularly into something more specific.

After more research, I think we should not call it DigitalPlatform but DigitalSystem or, actually, DigitalInfrastructure. I think "platform" has too much overlap with "service", and we cover those differently.

lzehl, Apr 02 '25 15:04

I'm still struggling to bring order into a meaningful model here.

INFRASTRUCTURE

  • has resources:
    • human
    • storage
    • compute
    • network
    • data (incl software, research data, computational models, etc)
    • hardware
    • energy
    • other infrastructures
  • provides services:
    • human-centered services
    • technology-centered services

TECHNOLOGY-CENTERED SERVICE

  • has component:
    • at least one software in a certain installation
    • maybe other services (technology or human centered)
    • maybe other resources than software

HUMAN-CENTERED SERVICE

  • has component:
    • human resource
    • maybe other services (technology or human centered)
    • maybe other resources than human

SOFTWARE

  • maybe has component:
    • other software

If that is the overall model connecting the essential components (leaving out, for now, the components of other resources), we still need to figure out which property (e.g. storage types, data types, etc.) is captured where in this model (with clear inheritance rules for information consolidation).
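
The composition rules in this outline can be sketched with a few Python dataclasses. This is only an illustration under assumptions: the class and field names (`Software`, `Service`, `Infrastructure`, `problems`) are hypothetical, resources are reduced to plain names, and only the two "at least one" rules from the outline are checked.

```python
from dataclasses import dataclass, field


@dataclass
class Software:
    name: str
    # "maybe has component: other software"
    components: list["Software"] = field(default_factory=list)


@dataclass
class Service:
    name: str
    kind: str  # "technology-centered" or "human-centered"
    software: list[Software] = field(default_factory=list)
    humans: list[str] = field(default_factory=list)
    # "maybe other services (technology or human centered)"
    services: list["Service"] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Check the 'at least one' rules for this service and its sub-services."""
        found = []
        if self.kind == "technology-centered" and not self.software:
            found.append(f"{self.name}: needs at least one software")
        if self.kind == "human-centered" and not self.humans:
            found.append(f"{self.name}: needs at least one human resource")
        for sub in self.services:
            found.extend(sub.problems())
        return found


@dataclass
class Infrastructure:
    name: str
    # resource kind (storage, compute, network, ...) -> resource names
    resources: dict[str, list[str]] = field(default_factory=dict)
    services: list[Service] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Collect rule violations across all provided services."""
        return [p for service in self.services for p in service.problems()]
```

A technology-centered service without any software is reported by `Infrastructure.problems()`, including when it is nested inside another service. Where properties such as storage types or data types should live is left open here, as in the comment above.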

lzehl, Apr 02 '25 16:04

Archana and I will discuss this during our next meeting and get back to you.

MBAbrams, Apr 03 '25 10:04

Here is a rough sketch (up for discussion) of a potential model/example (EBRAINS representations are in green; blue numbers indicate the allowed linkages; this is not yet fully in openMINDS language, which still needs to be specified further):

Image

  • RP(V) = research product (version)
  • SER(V) = service (version) [special RP(V)]
  • LE = legal entity
  • INF = infrastructure
  • CON = consortium

lzehl, Apr 03 '25 13:04

I'm unsure where FileRepository should be placed in this new model. Should it be a node between RP(V) and INF, with an optional property pointing to INF if the latter is not defined?

Raphael-Gazzotti, Apr 07 '25 06:04

The FileRepository will remain as a link for the RP(V).

In addition, the RP(V) will use the INF as host for the data and should point to the LE of the INF (or the INF directly, TBD) as host of the FileRepository.

Importantly, the FileRepository of an RP(V) could also be stored in a different INF than the one the RP(V) belongs to (e.g. code published on Zenodo or GitHub, while the software is a product integrated in EBRAINS).

Note: this submodel will definitely have dependencies on other parts of the model and should at some point be validated accordingly (graph validation). For now, we would need to write down what should not cause conflicts, so that curation can take care of it.

Another note: LE will not be a new schema. It is the Organization schema. But since we also use the Organization schema for non-legal entities, I needed to specify the condition in the diagram.

lzehl, Apr 07 '25 07:04

Putting the figure into openMINDS language (with some more adaptations to better fit our model):

A ResearchProductVersion (RPV) is provided by an Organization. This linkage could be defined through contribution/contributor with the role of provider, or with a specified providedBy property (to enforce requirement).

An Organization may be a legal entity or not (should be specified in the type property of this schema). It may have a parent Organization or not. It may also be a member of another Organization or a Consortium or not (isMemberOf; optional). It is located in a Country.

A Consortium may be defined by one or multiple Organizations or Persons or not (isDefinedBy; optional).

A ServiceVersion (SV) is provided by an Organization (see RPV). It is also part of an Infrastructure (isPartOf; required). It implements at least one other ResearchProductVersion or other ServiceVersion or other Infrastructure (as short cut?). This linkage could be defined through hasPart (current property), or maybe renamed to isImplementing or implements.

An Infrastructure is provided by an Organization (cf. RPV for implementation).

NEW to what we have now:

RPV:

  • maybe a dedicated isProvidedBy property (value: required, 1-N, linked type Organization|Person|Consortium) parallel to the new contribution/contributor property (value: embedded type Contribution/Contributor; cf #513)

For SV in addition to other RPV:

  • maybe replace hasPart with isImplementing (value: required, 1-N, linked type RPVs|Infrastructure)
  • add isPartOf property (value: required, 1(-N?), linked type Infrastructure)

For Organization cf #513

For Consortium

  • add isDefinedBy property (value: optional, 0-N, linked type Person|Organization)

Infrastructure would be an entirely new schema. Property and value constraints/expectations are TBD.
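
As a rough illustration of the proposed linked-type constraints, the following Python sketch checks which schema types a property may point to. The rule table mirrors the "NEW" list above; `LINK_RULES` and `check_links` are hypothetical names, inheritance (an SV being a special RPV) is not modelled, and none of this reflects the actual openMINDS schema syntax.

```python
# schema type -> property -> (required?, allowed linked types)
# Taken from the proposal above; still under discussion.
LINK_RULES = {
    "ResearchProductVersion": {
        "isProvidedBy": (True, {"Organization", "Person", "Consortium"}),
    },
    "ServiceVersion": {
        "isImplementing": (True, {"ResearchProductVersion", "Infrastructure"}),
        "isPartOf": (True, {"Infrastructure"}),
    },
    "Consortium": {
        "isDefinedBy": (False, {"Person", "Organization"}),
    },
}


def check_links(schema_type: str, links: dict[str, list[str]]) -> list[str]:
    """Validate the types of linked instances for one instance.

    `links` maps a property name to the schema types of its link targets.
    """
    problems = []
    for prop, (required, allowed) in LINK_RULES.get(schema_type, {}).items():
        targets = links.get(prop, [])
        if required and not targets:
            problems.append(f"{schema_type}.{prop} is required")
        for target in targets:
            if target not in allowed:
                problems.append(f"{schema_type}.{prop} cannot link to {target}")
    return problems
```

For example, a ServiceVersion whose isPartOf points to an Organization and which lacks isImplementing yields two messages, one per violated rule.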

lzehl, Apr 09 '25 14:04

Updated model from above (closer to openMINDS language now and with an EBRAINS example; note that the examples are not fully correct, so don't take this too seriously)

Image Image

lzehl, Apr 11 '25 12:04

first feedback from regular dev meeting by @openMetadataInitiative/openminds-developers

  • instead of using "implements" or "hasPart" in ServiceVersion maybe use "dependsOn" (matches feedback from @elenimath)
  • instead of using "isPartOf" for relating RPVs to INFRA turn this around to the INFRA with "hasPart" or "integrates"?
  • @apdavison specifically mentioned that we should revisit the publication extension to maybe define in more detail the consortium's relation to legal persons (organization or person)

lzehl, Apr 14 '25 13:04

@MBAbrams @archgogo result of first discussion (aligning INCF infrastructure portfolio with openMINDS plans for infrastructures):

  • fullName
  • shortName
  • abbreviation
  • homepage
  • digitalIdentifier
  • operativeMode (controlled, TBD: web application, network application, desktop application, mobile application)
  • possible access model, and process?
    • mode: AccessMode [virtual, on-site]
    • type: AccessType [open, authorization-required, mediated]
    • payment: PaymentType [free of payment, one-time payment, recurring payment]
    • authorizationProcess: AuthorizationProcess [none, registration-only, admin-controlled, committee-controlled, owner-controlled, membership-controlled]
  • possible contribution model and process?
  • isProvidedBy (-> Organization) [jurisdiction is associated here indirectly]
  • usageRestriction ??? (TBD: species, data types, formats, storage, etc)
  • hasPart/integrates (-> RP(V) + HardwareProduct(Version)?)

embargo might be mentioned under service types?
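
The controlled terms listed for the access model could be sketched as Python enumerations. This is an illustration only: the class names follow the bracketed type names in the list, while the member names, the field names and the `AccessModel` container with its `from_terms` helper are assumptions. The list itself is still marked as open in the discussion.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMode(Enum):
    VIRTUAL = "virtual"
    ON_SITE = "on-site"


class AccessType(Enum):
    OPEN = "open"
    AUTHORIZATION_REQUIRED = "authorization-required"
    MEDIATED = "mediated"


class PaymentType(Enum):
    FREE = "free of payment"
    ONE_TIME = "one-time payment"
    RECURRING = "recurring payment"


class AuthorizationProcess(Enum):
    NONE = "none"
    REGISTRATION_ONLY = "registration-only"
    ADMIN_CONTROLLED = "admin-controlled"
    COMMITTEE_CONTROLLED = "committee-controlled"
    OWNER_CONTROLLED = "owner-controlled"
    MEMBERSHIP_CONTROLLED = "membership-controlled"


@dataclass(frozen=True)
class AccessModel:
    mode: AccessMode
    access_type: AccessType
    payment: PaymentType
    authorization_process: AuthorizationProcess

    @classmethod
    def from_terms(cls, mode: str, access_type: str, payment: str,
                   authorization_process: str) -> "AccessModel":
        # Enum lookup by value raises ValueError for terms outside the lists.
        return cls(
            AccessMode(mode),
            AccessType(access_type),
            PaymentType(payment),
            AuthorizationProcess(authorization_process),
        )
```

Building an instance from free text, e.g. `AccessModel.from_terms("virtual", "open", "free of payment", "registration-only")`, rejects any term that is not in the controlled lists.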

lzehl, Apr 15 '25 08:04

  • isProvidedBy: arrays of Organization or Consortium
  • access model / operative mode / usage restriction are properties that should instead be attached to Services
  • possible contribution (IRI document/free text)

Raphael-Gazzotti, Nov 03 '25 13:11

REVISING THE WHOLE SEMANTICS TO FORM THE NEEDED MODEL:

DigitalPlatform:

  • description: A digital platform is a user-facing, coherent environment that provides and coordinates its own interfaces, services, and tools, which operate on top of and make use of the interfaces, services, and tools provided by one or multiple digital infrastructures. It may be built on top of one or more other digital platforms, forming nested or layered platform structures.

DigitalInfrastructure:

  • description: A digital infrastructure comprises the foundational technical resources (compute, storage, network, hardware, base systems) operated or coordinated by a legal person (natural person or legal entity), whose resources are always accessed via one or more interfaces that may be exposed directly by the infrastructure or mediated by a digital platform. It may consist of, depend on, or contain other digital infrastructures, forming nested or composite infrastructure structures.

A platform must obey the policies of the infrastructure(s) it runs on. A platform cannot override or contradict infrastructure-level policies. Platform policies extend but do not replace infrastructure policies.
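
One way to read the "extend but do not replace" rule is sketched below in Python. It rests on strong assumptions that are not part of the proposal: policies are represented as plain key/value pairs, only a single infrastructure is considered, and the function name `effective_policies` is hypothetical.

```python
def effective_policies(infrastructure: dict[str, str],
                       platform: dict[str, str]) -> dict[str, str]:
    """Combine infrastructure and platform policies.

    The platform may add new policy keys, but it may not set a different
    value for a key that the infrastructure already defines.
    """
    conflicts = sorted(
        key for key in platform
        if key in infrastructure and platform[key] != infrastructure[key]
    )
    if conflicts:
        raise ValueError(
            "platform contradicts infrastructure policies: " + ", ".join(conflicts)
        )
    # Infrastructure policies stay in force; platform policies are added on top.
    return {**infrastructure, **platform}
```

A platform that only adds a new policy key gets the union of both sets, whereas a platform that redefines an infrastructure-level key raises a `ValueError`.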

A platform directly offers the following:

  • software (versions)
  • services

A platform indirectly offers the following:

  • interfaces for services via software
  • datasets via services

An infrastructure hosts the following:

  • deployed software & interfaces (incl platform software & interfaces)
  • datasets (stores)
  • generated artifacts (stores)

An infrastructure has the following components:

  • hardware
  • foundational system software
  • infrastructure-level interfaces

Image https://drive.google.com/file/d/123XoP5xT1Hv45M3E9o0MqzYqUeQhgqb4/view?usp=sharing

@apdavison @olinux ???

lzehl, Nov 20 '25 09:11