
submission form should automatically fill in `publisher` information

Open jeanetteclark opened this issue 6 years ago • 9 comments

When a dataset is submitted to a member node, the publisher information should be inserted into the EML document automatically. This will make our metadata more Accessible according to the FAIR metadata quality suite.

We should include an organization identifier in the userId field. "ROR or GRID or WIKIDATA would be good" - @mbjones

jeanetteclark avatar Sep 10 '19 21:09 jeanetteclark

What kind of publisher metadata can we automatically insert? Doesn't that need to be provided by the user? Or should we automatically set it to the Member Node name?

laurenwalker avatar Sep 12 '19 17:09 laurenwalker

No, not provided by the user because the publisher is whatever member node they are publishing to.

It goes in eml/dataset/publisher as a responsibleParty. I imagine it would just have Arctic Data Center as the organization, plus one of the above identifiers (although I cannot find what our organization identifier is in any of those systems for the life of me). @mbjones may be able to advise.
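As a rough illustration of what that element could look like, the sketch below serializes a plain object into an EML `publisher` element. The helper names (`escapeXml`, `publisherToEml`) and the `userIdDirectory` field are assumptions made for this example, not existing MetacatUI code. The child order follows the EML ResponsibleParty type, and EML requires a `directory` attribute on `userId`.

```javascript
// Escape the five XML special characters so values are safe in text and attributes.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: build an EML <publisher> element from a plain object.
// Children are emitted in ResponsibleParty order: organizationName, onlineUrl, userId.
function publisherToEml({ organizationName, onlineUrl, userId, userIdDirectory }) {
  const lines = ["<publisher>"];
  if (organizationName) {
    lines.push(`  <organizationName>${escapeXml(organizationName)}</organizationName>`);
  }
  if (onlineUrl) {
    lines.push(`  <onlineUrl>${escapeXml(onlineUrl)}</onlineUrl>`);
  }
  if (userId) {
    // The fallback "unknown" directory is a placeholder assumption.
    const directory = escapeXml(userIdDirectory || "unknown");
    lines.push(`  <userId directory="${directory}">${escapeXml(userId)}</userId>`);
  }
  lines.push("</publisher>");
  return lines.join("\n");
}

console.log(
  publisherToEml({
    organizationName: "Arctic Data Center",
    onlineUrl: "https://arcticdata.io",
  })
);
```

Which identifier and `directory` value to use for `userId` is left open here, since that depends on which registry the repository ends up in.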

jeanetteclark avatar Sep 12 '19 17:09 jeanetteclark

@laurenwalker ADC and other repositories may not be registered in those systems, but we need to get them there. I asked @gothub a few months ago to look into getting that rolling, and we ran into some timing issues with ROR. But GRID and Wikidata should be possible. Once they are in GRID, they should automatically end up in ROR.

mbjones avatar Sep 12 '19 18:09 mbjones

our wikidata identifier is: Q77285095

jeanetteclark avatar May 05 '20 17:05 jeanetteclark

I noted today that users can enter their own publisher information. It seems to me that they shouldn't be able to do this - the field is usually misinterpreted anyway. So as part of this issue I think we should consider removing that ability from the UI

jeanetteclark avatar May 14 '20 18:05 jeanetteclark

Agreed, or at least make it a config option as to whether it shows up.

mbjones avatar May 14 '20 18:05 mbjones

Here is a summary of the tasks required for this issue, as I understand it:

  • [ ] Insert repository info into EML documents in the Publisher field. The specific information that is inserted will be configurable for each repository.

option 1: we add a new config option that is specifically used to provide the Publisher information, e.g.

/**
 * Information about the repository that will be automatically inserted into
 * new EML metadata documents as the Publisher. This object can set any of the
 * fields that are available in the Responsible Party EML type, see
 * {@link https://github.com/NCEAS/eml/blob/main/img/eml-party.png}.
 * @type {object}
 */
publisher: {
  organizationName: 'Arctic Data Center',
  userId: 'Q77285095',
  onlineUrl: 'https://arcticdata.io'
}

option 2: we could pull this information from other configuration options: organizationName = repositoryName and onlineUrl = baseUrl. We would just need to add a repositoryId for the userId, and an automaticallyFillPublisher (or similar) boolean option.

  • [ ] Add a showPublisherInEditor config option. When the publisher option is empty (option 1) or automaticallyFillPublisher is false (option 2), check showPublisherInEditor to decide whether to display the "Publisher" role in the People section of the EML editor. [screenshot: publisher-in-metacatui]
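To make the two options and the editor flag concrete, here is a minimal sketch of how they might be resolved together. The function names `resolvePublisher` and `showPublisherRole` are hypothetical, `config` is a stand-in for the repository configuration object, and hiding the role whenever the publisher is auto-filled is an assumption (that point is still an open question below).

```javascript
// Sketch only: work out which publisher info (if any) should be auto-inserted.
function resolvePublisher(config) {
  // Option 1: a dedicated `publisher` object, used as-is when non-empty.
  const explicit = config.publisher;
  if (explicit && Object.keys(explicit).length > 0) {
    return { ...explicit };
  }
  // Option 2: derive the fields from existing options plus a new repositoryId.
  if (config.automaticallyFillPublisher) {
    return {
      organizationName: config.repositoryName,
      onlineUrl: config.baseUrl,
      userId: config.repositoryId,
    };
  }
  // Neither option is configured: nothing is inserted automatically.
  return null;
}

// Sketch only: decide whether the "Publisher" role appears in the editor.
function showPublisherRole(config) {
  // Assumption: an auto-filled publisher is kept out of the editor UI.
  if (resolvePublisher(config) !== null) {
    return false;
  }
  return Boolean(config.showPublisherInEditor);
}

const exampleConfig = {
  automaticallyFillPublisher: true,
  repositoryName: "Arctic Data Center",
  baseUrl: "https://arcticdata.io",
  repositoryId: "Q77285095",
};
console.log(resolvePublisher(exampleConfig));
console.log(showPublisherRole(exampleConfig));
```

Supporting both branches in one function would let a repository override the derived values with an explicit `publisher` object, but picking just one of the two options would be simpler to document.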

Questions

  1. Do we add this information:
     a. to new EML documents only?
     b. also to existing EML documents that have no publisher when they are edited?
  2. When the Publisher information is pre-filled, should we display this in the editor but make the fields un-editable? Or just keep it hidden behind the scenes?

@mbjones and @jeanetteclark, what do you think of this plan and do you have any feedback on these two questions? Thanks!

robyngit avatar Oct 19 '22 15:10 robyngit

To answer your questions @robyngit:

1. Do we add this information:
   a. to new EML documents only?
   b. also to existing EML documents that have no publisher when they are edited?

2. When the Publisher information is pre-filled, should we display this in the editor but make the fields un-editable? Or just keep it hidden behind the scenes?

Answers: 1a & 1b. We add publisher information to all new EML documents, as well as to existing EML documents that did not have that info listed when they are edited. So if someone were to edit a published dataset, we're still checking that the publisher information is there when we curate the new changes. If it's not, we'll add it.

  2. I think keeping that info hidden behind the scenes is the better route here. I can see us getting questions about why we include a non-editable response in the editor 😅 I think it would make sense(!), but I'd like to prevent those questions if possible 🙂
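A minimal sketch of the "add it only if it's missing" behaviour from answers 1a and 1b could look like the following. Here `dataset` is a simplified plain-object stand-in for the parsed eml/dataset content, and `ensurePublisher` is a hypothetical name, not an existing MetacatUI function.

```javascript
// Sketch: add the repository's publisher info on save, for new and edited
// documents alike, but never overwrite publisher info that is already there.
function ensurePublisher(dataset, repositoryPublisher) {
  const existing = dataset.publisher;
  const hasPublisher = existing != null && Object.keys(existing).length > 0;
  if (hasPublisher || !repositoryPublisher) {
    return dataset; // leave the document as it is
  }
  // Return a copy so the caller's object is not mutated.
  return { ...dataset, publisher: { ...repositoryPublisher } };
}

const repositoryPublisher = {
  organizationName: "Arctic Data Center",
  onlineUrl: "https://arcticdata.io",
};
const edited = ensurePublisher({ title: "Example dataset" }, repositoryPublisher);
console.log(edited.publisher.organizationName); // prints "Arctic Data Center"
```

Because the publisher never appears in the editor in this sketch, the check would run at serialization time rather than in the People section.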

dvirlar2 avatar May 18 '23 20:05 dvirlar2

Here's our current metadata entry for the ADC, listing our various identifiers and other info from our schema.org entry on our home page:

ADC schema.org entry
{
  "@context": [
    "https://schema.org/"
  ],
  "@type": [
    "Service",
    "Organization",
    "ResearchProject"
  ],
  "@id": "https://arcticdata.io",
  "name": "Arctic Data Center",
  "legalName": "Arctic Data Center",
  "alternateName": "ADC",
  "logo": "https://arcticdata.io/wp-content/themes/aurora/library/images/logo_.png",
  "url": "https://arcticdata.io",
  "description": "The Arctic Data Center is the primary data and software repository for the Arctic section of NSF Polar Programs.",
  "identifier": [
    {
      "@type": "PropertyValue",
      "name": "ROR:055hrh286",
      "propertyID": "https://registry.identifiers.org/registry/ror",
      "value": "ror:055hrh286",
      "url": "https://ror.org/055hrh286"
    },
    {
      "@type": "PropertyValue",
      "name": "Re3data DOI: 10.17616/R37P98",
      "propertyID": "https://registry.identifiers.org/registry/doi",
      "value": "doi:10.17616/R37P98",
      "url": "https://doi.org/10.17616/R37P98"
    },
    {
      "@type": "PropertyValue",
      "name": "wikidata:Q77285095",
      "propertyID": "https://registry.identifiers.org/registry/wikidata",
      "value": "wikidata:Q77285095",
      "url": "https://www.wikidata.org/wiki/Q77285095"
    },
    {
      "@type": "PropertyValue",
      "name": "grid:grid.507882.0",
      "propertyID": "https://registry.identifiers.org/registry/grid",
      "value": "grid:grid.507882.0",
      "url": "https://www.grid.ac/institutes/grid.507882.0"
    }
  ],
  "sameAs": [
    "https://ror.org/055hrh286",
    "https://www.grid.ac/institutes/grid.507882.0",
    "https://www.wikidata.org/wiki/Q77285095",
    "https://www.re3data.org/repository/r3d100011973",
    "http://doi.org/10.17616/R37P98",
    "urn:node:ARCTIC"
  ],
  "category": [
    "Arctic Research"
  ],
  "provider": {
    "@id": "https://arcticdata.io"
  },
  "contactPoint": {
    "@type": "ContactPoint",
    "name": "Support",
    "email": "[email protected]",
    "url": "https://arcticdata.io/support/",
    "contactType": "customer support"
  },
  "foundingDate": "2016-02-01",
  "funder": {
    "@type": "Organization",
    "@id": "https://doi.org/10.13039/100000087",
    "legalName": "Office of Polar Programs",
    "alternateName": "OPP",
    "url": "https://www.nsf.gov/div/index.jsp?div=OPP",
    "identifier": {
      "@type": "PropertyValue",
      "propertyID": "https://registry.identifiers.org/registry/doi",
      "value": "doi:10.13039/100000087",
      "url": "https://doi.org/10.13039/100000087"
    },
    "parentOrganization": {
      "@type": "Organization",
      "@id": "https://doi.org/10.13039/100000085",
      "legalName": "Directorate for Geosciences",
      "alternateName": "NSF-GEO",
      "url": "http://www.nsf.gov",
      "identifier": {
        "@type": "PropertyValue",
        "propertyID": "https://registry.identifiers.org/registry/doi",
        "value": "10.13039/100000085",
        "url": "https://doi.org/10.13039/100000085"
      },
      "parentOrganization": {
        "@type": "Organization",
        "@id": "https://doi.org/10.13039/100000001",
        "legalName": "National Science Foundation",
        "alternateName": "NSF",
        "url": "http://www.nsf.gov",
        "identifier": {
          "@type": "PropertyValue",
          "propertyID": "https://registry.identifiers.org/registry/doi",
          "value": "10.13039/100000001",
          "url": "https://doi.org/10.13039/100000001"
        }
      }
    }
  },
  "hasOfferCatalog": {
    "@type": "OfferCatalog",
    "name": "Arctic Data Center Data Catalog",
    "itemListElement": [
      {
        "@type": "DataCatalog",
        "@id": "https://arcticdata.io/catalog/data",
        "name": "Arctic Data Center Catalog",
        "audience": {
          "@type": "Audience",
          "audienceType": "public",
          "name": "General Public"
        }
      }
    ]
  },
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1021 Anacapa Street",
    "addressLocality": "Santa Barbara",
    "addressRegion": "CA",
    "postalCode": "93101",
    "addressCountry": "US"
  },
  "parentOrganization": {
    "@type": "Organization",
    "@id": "https://ror.org/0146z4r19",
    "legalName": "National Center for Ecological Analysis and Synthesis",
    "alternateName": "NCEAS",
    "url": "http://nceas.ucsb.edu",
    "identifier": {
      "@type": "PropertyValue",
      "propertyID": "https://registry.identifiers.org/registry/ror",
      "value": "ror:0146z4r19",
      "url": "https://ror.org/0146z4r19"
    },
    "parentOrganization": {
      "@type": "Organization",
      "@id": "https://ror.org/02t274463",
      "legalName": "University of California, Santa Barbara",
      "alternateName": "UCSB",
      "url": "http://ucsb.edu",
      "identifier": {
        "@type": "PropertyValue",
        "propertyID": "https://registry.identifiers.org/registry/ror",
        "value": "ror:02t274463",
        "url": "https://ror.org/02t274463"
      }
    }
  },
  "inLanguage": "en-US",
  "addressCountry": "US",
  "license": [
    "http://spdx.org/licenses/CC0-1.0",
    "https://spdx.org/licenses/CC-BY-4.0"
  ],
  "credentialCategory": "CoreTrustSeal",
  "termsOfService": [
    "http://spdx.org/licenses/CC0-1.0",
    "https://spdx.org/licenses/CC-BY-4.0"
  ],
  "ex:persistentIdentifiers": [
    "https://registry.identifiers.org/registry/doi",
    "https://registry.identifiers.org/registry/orcid",
    "https://registry.identifiers.org/registry/ror",
    "https://registry.identifiers.org/registry/rrid",
    "https://registry.identifiers.org/registry/d1id",
    "https://registry.identifiers.org/registry/ark"
  ],
  "ex:machineInteroperability": [
    "DataONE",
    "OAI-PMH",
    "DataCite",
    "REST",
    "SPARQL"
  ],
  "ex:metadata": [
    "EML",
    "ISO-19115",
    "DDI",
    "Dublin Core",
    "FGDC CSDGM",
    "METS",
    "DataCite",
    "OAI-ORE",
    "other"
  ],
  "ex:curation": "https://arcticdata.io/submit/",
  "ex:preservationPolicy": "https://arcticdata.io/preservation/",
  "ex:termsOfAccess": [
    "http://spdx.org/licenses/CC0-1.0",
    "https://spdx.org/licenses/CC-BY-4.0"
  ]
}

I'm guessing the ROR is the best identifier to use these days.
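If the publisher `userId` were taken from this entry, one way to prefer ROR while falling back to the other registries might be the sketch below. `pickIdentifier` and the default preference order are assumptions for illustration; the sample data is a subset of the `identifier` array above.

```javascript
// Sketch: choose one identifier from a schema.org-style `identifier` array,
// trying registries in order of preference (ROR first).
function pickIdentifier(identifiers, preferred = ["ror", "wikidata", "grid"]) {
  for (const registry of preferred) {
    const match = identifiers.find(
      (id) =>
        typeof id.propertyID === "string" &&
        id.propertyID.endsWith(`/${registry}`)
    );
    if (match) {
      return { registry, value: match.value, url: match.url };
    }
  }
  return null; // none of the preferred registries is present
}

const adcIdentifiers = [
  {
    propertyID: "https://registry.identifiers.org/registry/wikidata",
    value: "wikidata:Q77285095",
    url: "https://www.wikidata.org/wiki/Q77285095",
  },
  {
    propertyID: "https://registry.identifiers.org/registry/ror",
    value: "ror:055hrh286",
    url: "https://ror.org/055hrh286",
  },
];

const chosen = pickIdentifier(adcIdentifiers);
console.log(chosen.url); // prints "https://ror.org/055hrh286"
```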

mbjones avatar Nov 28 '23 19:11 mbjones