science-on-schema.org

Use of "citation"

Open mathiasbockwoldt opened this issue 6 years ago • 17 comments

The "citation" field in the examples is not necessarily used as schema.org intends. In the examples, it appears to hold a string describing how to cite the given dataset. On schema.org, however, it is defined as a reference to other CreativeWorks. There could be a new field, called e.g. "citeAs", that carries this information. The explanation for that field should suggest using an identifier such as a DOI. A citation string (as used in the examples here) is not very useful, since different journals cite datasets in different ways (is it e.g. "J. Smith" or "Smith, J." or "Joe Smith" or "Smith, Joe"?). I/we (Polar Data Forum III) suggest providing a DOI if available; otherwise referring to an object with author, title, etc. (which may already be given in another part of the metadata); or, as a last resort, giving a citation string.

mathiasbockwoldt avatar Nov 20 '19 14:11 mathiasbockwoldt

Leyla Garcia from bioschemas.org is reaching out to contacts at schema.org about the definition change.

ashepherd avatar Dec 19 '19 20:12 ashepherd

see: https://github.com/schemaorg/schemaorg/issues/2325

ashepherd avatar Dec 19 '19 20:12 ashepherd

https://developers.google.com/search/docs/data-types/dataset used to define this field as "Preferred citation for this dataset", but it has been updated to say, "Identifies academic articles that are recommended by the data provider be cited in addition to the dataset itself. Provide the citation for the dataset itself with other properties, such as name, identifier, creator, and publisher properties."

I think it's safe to begin a pull request to update the guidance document on how to properly use this field.
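As a rough illustration of the updated Google wording quoted above, a record following it might look like the sketch below. All names and URLs are made-up placeholders, and the choice of `ScholarlyArticle` for the cited work is an assumption, not something the guidance mandates. The dataset's own citation details go in `name`, `identifier`, `creator`, and `publisher`, while `citation` points at a related article:

```json
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "Example dataset (placeholder title)",
  "identifier": "https://www.example.com/dataset/123",
  "creator": {
    "@type": "Person",
    "name": "Jane Example"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Example Data Repository"
  },
  "citation": {
    "@type": "ScholarlyArticle",
    "name": "Example article the provider recommends citing alongside the dataset",
    "url": "https://www.example.com/article/456"
  }
}
```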

ashepherd avatar Jan 09 '20 20:01 ashepherd

Following https://github.com/schemaorg/schemaorg/issues/1031

ashepherd avatar Feb 03 '20 15:02 ashepherd

https://github.com/schemaorg/schemaorg/issues/1031 doesn't seem to resolve the question of what schema:citation is supposed to mean. The so:citation scope note "A citation or reference to another creative work, such as another publication, web page, scholarly article, etc." is not very useful, essentially 'a citation is a citation'.
The comment by ljgarcia presents two options:

  • the CreativeWork is a publication about that Dataset
  • the CreativeWork references the Dataset because it makes a claim based on data contained in that Dataset

And this discussion notes a third potential interpretation:

  • Preferred citation for this dataset

SOSO should pick one and recommend that. I suggest using "so:citation: a reference to a resource made because the CreativeWork makes a claim based on data contained in that resource" (modified from @ljgarcia's second option). Given that the 2016 issue about this in the schema.org issue tracker has gone nowhere, I don't think we should count on any updates to schema.org.

smrgeoinfo avatar Aug 24 '20 23:08 smrgeoinfo

@smrgeoinfo thanks for the clarifications, I agree they are important. For some more context, in EML, we have fields for all three of those concepts. They are:

  • usageCitation: used whenever a work uses or incorporates the dataset; this is the 'claim is based on' case, and is a traditional citation in that regard (we would map this to the DataCite citedBy property)
  • referencePublication: used when there is a canonical citation that should be used to represent this dataset; I think this is the 'preferred citation' case you list above
  • literatureCited: this is a list of related works that are cited in some way by the dataset (often as part of the background/context of the dataset metadata)

I think it would be good to differentiate at least these three roles of citation references in the so:citation clarifications.

mbjones avatar Sep 21 '20 20:09 mbjones

Difficulty: Easy

positives

  • will address a recent fix to current guidelines
  • provide clarity on its meaning and usefulness as opposed to its previous definition (for describing the Dataset's own citation)
  • used by Google Dataset Search tool to link to Google Scholar

negatives

  • does not address how to specify the type of relationship between the Dataset and the cited CreativeWork

+1 to include in v1.3

ashepherd avatar Feb 02 '21 19:02 ashepherd

As a reference, it looks like DataCite takes the more specific properties from its schema (References and Cites) and aggregates them into this so:citation property. I'm not sure whether other rules are applied, so maybe @mfenner could describe what their algorithm is?

ashepherd avatar Jul 19 '21 14:07 ashepherd

@mbjones will reach out to Martin Fenner at DataCite about their algorithm

ashepherd avatar Jul 23 '21 20:07 ashepherd

Discussed the possibility of the ESIP schema.org cluster managing a vocabulary of dataset relations (mirroring the DataCite Schema relation types)

ashepherd avatar Jul 23 '21 20:07 ashepherd

Re: Garza's mention of how DataCite uses schema:citation

https://github.com/ESIPFed/science-on-schema.org/issues/128#issuecomment-888458367

ashepherd avatar Aug 02 '21 21:08 ashepherd

We had discussed possibly using LinkRole to specify the relationship of the object of the citation. Here's an example, using the DataCite relationship terms as the linkRelationship text value. The 'roleName' value is text or a URL, so if there is a URI for the relationship, it could go there.

{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "citation": [
    {
      "@type": "CreativeWork",
      "url": {
        "@type": "LinkRole",
        "url": "https://www.example.com/articlethatUsesDataset",
        "description": "link to publication that bases scientific conclusions on analysis using the dataset",
        "roleName": "https://eml.ecoinformatics.org/whats-new-in-eml-2-2-0.html#usage-citations",
        "linkRelationship": "IsCitedBy"
      }
    },
    {
      "@type": "CreativeWork",
      "url": {
        "@type": "LinkRole",
        "url": "https://www.example.com/articlethatCommentsOndataset",
        "description": "link to a publication that comments on/discusses the dataset",
        "roleName": "https://eml.ecoinformatics.org/whats-new-in-eml-2-2-0.html#referencePublication",
        "linkRelationship": "IsReferencedBy"
      }
    },
    {
      "@type": "CreativeWork",
      "url": {
        "@type": "LinkRole",
        "url": "https://www.example.com/articlethatProvidesSupplementalInformation",
        "description": "link to a publication that provides additional information useful to understand the dataset, e.g. analytical procedures, scientific context.",
        "roleName": "https://...",
        "linkRelationship": "Supplements"
      }
    }
  ]
}

Still doesn't solve how to assert a 'recommended citation' text string to use when citing the dataset; perhaps a convention that if schema:citation has a text value (not a CreativeWork) then that is assumed to be the recommended citation string.
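A minimal sketch of what that convention could look like, assuming it were adopted (it is only a suggestion at this point, and every value below is a placeholder): the plain text entry in `citation` would be read as the recommended citation string, and the `CreativeWork` entry as a related work.

```json
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "Example dataset (placeholder title)",
  "citation": [
    "Example, J. (2021). Example dataset. Example Data Repository. https://www.example.com/dataset/123",
    {
      "@type": "CreativeWork",
      "name": "Example article that uses the dataset",
      "url": "https://www.example.com/articlethatUsesDataset"
    }
  ]
}
```

A consumer would have to tell the two cases apart by value type alone, which is the weak point of this approach.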

smrgeoinfo avatar Sep 20 '21 20:09 smrgeoinfo

Should it be "IsReferencedBy" (w/ a "D" at the end of reference)? Source: Scholix (appendix 3.1 - https://zenodo.org/record/1120265)

On Mon, Sep 20, 2021 at 1:17 PM Stephen Richard wrote:

We had discussed possibly using LinkRole to specify relationship of the object of the citation. here's an example

[...] "linkRelationship":"IsReferenceBy" [...]


-- Elisha M Wood-Charlson, PhD (she/her) KBase https://kbase.us/ User Engagement Lead; @DOEKBase https://twitter.com/doekbase NMDC http://microbiomedata.org/ @microbiomedata https://twitter.com/MicrobiomeData Lawrence Berkeley National Laboratory LinkedIn http://www.linkedin.com/in/elishawc, Twitter https://twitter.com/ElishaMariePhD (personal)

elishawc avatar Sep 21 '21 15:09 elishawc

Did this get resolved? I am trying to do the same thing in RO-Crate - that is, provide a textual citation for a dataset

ptsefton avatar May 24 '23 06:05 ptsefton

I have suggested so:creditText for textual citations over at the RO-Crate repo: https://github.com/ResearchObject/ro-crate/issues/265
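Roughly, that would look like the sketch below. The values are placeholders, and this only illustrates the suggestion; it is not an agreed SOSO or RO-Crate recommendation. `creditText` is a text-valued property on CreativeWork, so it can sit directly on the Dataset:

```json
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "Example dataset (placeholder title)",
  "identifier": "https://www.example.com/dataset/123",
  "creditText": "Example, J. (2023). Example dataset. Example Data Repository. https://www.example.com/dataset/123"
}
```

This leaves `citation` free for references to related works.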

ptsefton avatar May 25 '23 00:05 ptsefton

That's in their "new" area, and it looks like what we need. Good suggestion!

smrgeoinfo avatar May 25 '23 15:05 smrgeoinfo

So, is the SOSO recommendation to use the schema.org creditText field to give a citation string for the dataset? We are working to map SPASE, the metadata system in Heliophysics, following the SOSO guidance, and this is not clear. The example given is a 1966 paper for a dataset, and since we are moving away from citing publications and towards citing the dataset across the sciences, this example is not ... helpful. Perhaps a section could be added on this, specifically on how to include a string citation for the dataset and how to include information for closely related publications (e.g. documentation or a publication about the dataset that has a DOI)?

rebeccaringuette avatar Oct 01 '24 21:10 rebeccaringuette