
Change term - MaterialCitation to MaterialMention

Open Archilegt opened this issue 3 years ago • 17 comments

Term change

  • Submitter: Carlos Martínez
  • Efficacy Justification (why is this change necessary?): To expand the term definition and match it to its intended use, as per the examples provided and subsequent discussion.
  • Demand Justification (if the change is semantic in nature, name at least two organizations that independently need this term): As presented in https://github.com/tdwg/dwc/issues/329#issuecomment-893080620, plus institutional collections, archives, and data managers. It reflects a broader spectrum of physical evidence-based Occurrence sources amenable to digitization and text-mining, and the interest of institutions in adding records of historical holdings to their databases.
  • Stability Justification (what concerns are there that this might affect existing implementations?): The original term MaterialCitation is currently in production. If quickly replaced by MaterialMention, there should be no stability concern.
  • Implications for dwciri: namespace (does this change affect a dwciri term version)?:

Current Term definition: https://dwc.tdwg.org/list/#dwc_MaterialCitation

Proposed attributes of the new term:

  • Term name (in lowerCamelCase for properties, UpperCamelCase for classes): MaterialMention

  • Organized in Class (e.g., Occurrence, Event, Location, Taxon): The proposal is for a Class of basisOfRecord property

  • Definition of the term (normative): An intangible result of mentioning a physical evidence-based Occurrence in a source.

  • Usage comments (recommendations regarding content, etc., not normative): This class replaces MaterialCitation as a value in the controlled vocabulary recommended for basisOfRecord. When importing Darwin Core Archives of literature-based datasets to GBIF, the basisOfRecord should be changed from "Occurrence", "PreservedSpecimen", "Literature", or "MaterialCitation" to "MaterialMention". To be used even when the original physical evidence of the Occurrence is no longer preserved in a collection. Usage should not overlap with records of an Occurrence without physical evidence (e.g., a human observation taken from field notes or the literature).

  • Examples (not normative): the mention of a fossil specimen from a scientific collection in a taxonomic treatment in a scientific publication; the mention of a preserved specimen in an unpublished collection catalog book; the mention of a material sample in a field note book; the mention of a microscope slide in a card catalog or seller's invoice letter; the mention of a jar with multiple specimens in a loan invoice.

  • Refines (identifier of the broader term this term refines; normative): None

  • Replaces (identifier of the existing term that would be deprecated and replaced by this term; normative): http://rs.tdwg.org/dwc/terms/MaterialCitation

  • ABCD 2.06 (XPATH of the equivalent term in ABCD or EFG; not normative): Not in ABCD.
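
As an illustration of the basisOfRecord remapping described in the usage comments above, here is a minimal sketch. It assumes records are plain dictionaries keyed by Darwin Core term names; the function name `remap_basis_of_record` and the constant `LEGACY_VALUES` are hypothetical, and the sketch shows only what the proposal as written would imply, not an adopted practice.

```python
# Values the usage comments propose to replace in literature-based datasets.
# (Hypothetical constant name; the values themselves come from the proposal.)
LEGACY_VALUES = {"Occurrence", "PreservedSpecimen", "Literature", "MaterialCitation"}


def remap_basis_of_record(record, target="MaterialMention"):
    """Return a copy of `record` with a legacy basisOfRecord value replaced.

    Records whose basisOfRecord is not one of the legacy values (for example
    a HumanObservation without physical evidence) are returned unchanged.
    """
    updated = dict(record)  # do not mutate the caller's record
    if updated.get("basisOfRecord") in LEGACY_VALUES:
        updated["basisOfRecord"] = target
    return updated


literature_row = {"scientificName": "Aus bus", "basisOfRecord": "Literature"}
observation_row = {"scientificName": "Aus bus", "basisOfRecord": "HumanObservation"}

print(remap_basis_of_record(literature_row)["basisOfRecord"])
print(remap_basis_of_record(observation_row)["basisOfRecord"])
```

The second record is left alone because the proposal excludes Occurrences without physical evidence.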

Archilegt avatar Aug 05 '21 09:08 Archilegt

@debpaul @deepreef , could you please see if the new proposed definition makes sense and matches the examples? I tried to abduct the definition from the broader context of MaterialMention(s), including but not limited to MaterialCitation(s). I also tried to align the wording of the new definition with other existing DwC definitions.

Archilegt avatar Aug 09 '21 12:08 Archilegt

We really should NOT change term local names when we change definitions. They are not descriptive names or labels -- they are part of a permanent, universally unique identifier for the term (along with the namespace they form the IRI that identifies the term). We can change the labels as much as we want without breaking anything but we should never change the IRIs.

We made this mistake when we changed dwc:individualID to dwc:organismID and we should not do it again.
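
The distinction drawn above between a term's identifier and its label can be sketched roughly as follows. The `Term` class, its fields, and the label and definition strings are hypothetical placeholders; only the namespace and the local name `MaterialCitation` are taken from this issue.

```python
from dataclasses import dataclass, replace

# Namespace as it appears in the term IRI cited in this issue.
DWC_NS = "http://rs.tdwg.org/dwc/terms/"


@dataclass(frozen=True)
class Term:
    local_name: str   # part of the permanent identifier
    label: str        # human-readable, free to change
    definition: str   # free to change

    @property
    def iri(self):
        # The IRI is the namespace plus the local name.
        return DWC_NS + self.local_name


# Placeholder label and definition text, for illustration only.
term = Term("MaterialCitation", "Material Citation", "original definition text")

# Revising the label and definition leaves the identifier untouched.
revised = replace(term, label="Material Mention", definition="broadened definition text")

print(term.iri)
print(revised.iri == term.iri)
```

Changing `local_name` instead would produce a different IRI, which is the kind of break the comment warns against.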

baskaufs avatar Aug 09 '21 13:08 baskaufs

Here is another reference to the need for a way to distinguish data coming from field notes.

https://github.com/ArctosDB/arctos/issues/2432#issuecomment-906888024

tucotuco avatar Aug 27 '21 03:08 tucotuco

I don't see what the difference is between MaterialCitation and MaterialMention. Effectively they mean the same thing.

qgroom avatar Nov 03 '21 13:11 qgroom

@qgroom , I don't think that they mean the same, at least in Plazi's view and mine. MaterialCitation is constrained to published works. MaterialMention is an abduction encompassing both published and unpublished mentions of specimens.

Archilegt avatar Nov 04 '21 10:11 Archilegt

I strongly believe that MaterialCitation should not be constrained to "published" (sensu stricto) works -- in part because the definition of "published" is highly ambiguous, and in part because there is nothing to be gained by restricting the scope of this class in such a way. If it's important to people to be able to filter on the subset of MaterialCitation instances that appeared in works that meet some pre-defined threshold of "published", then we can add one or more properties to this class to capture this kind of information.

Or, alternatively, we can just define the word "published" in the context MaterialCitation to mean "any form of documented information". But my preference would be to change the words "scholarly publications" in the definition to "documented sources" (or something similar).

But I don't think it makes sense to maintain two separate classes with essentially identical properties, where in many cases it is entirely arbitrary which of the two classes a particular instance belongs.

deepreef avatar Nov 04 '21 17:11 deepreef

@deepreef writes and @qgroom concurs:

> I strongly believe that MaterialCitation should not be constrained to "published" (sensu stricto) works -- in part because the definition of "published" is highly ambiguous, and in part because there is nothing to be gained by restricting the scope of this class in such a way.

+1 @deepreef and I think this makes it easier too for everyone to cite grey literature.

debpaul avatar Nov 04 '21 17:11 debpaul

Thanks, @debpaul ! Yeah, from my perspective, it's actually all grey -- just different shades.

deepreef avatar Nov 04 '21 17:11 deepreef

Expanding the definition and examples of MaterialCitation works for me as well. Would it work for Plazi?

Archilegt avatar Nov 04 '21 18:11 Archilegt

> Expanding the definition and examples of MaterialCitation works for me as well. Would it work for Plazi?

I know that @myrmoteras expressed an intention to restrict it to "published" works only, but I'm not sure how strongly he feels this way. Perhaps he or someone else from PLAZI can comment?

deepreef avatar Nov 04 '21 18:11 deepreef

I am against the change.

In the taxonomic world we deal with published works that we consider relevant for building, for example, our naming system. The Codes are very precise about which publications can make a name available.

A materialCitation is a requirement to create a new species as part of a taxonomic treatment as part of a publication.

This is a subset of what is being discussed here, but a relevant one, because it is also the result of a scientific study.

The act of identifying the specimen cited in the MaterialCitation is documented and reproducible because it includes not only a taxonomic name but also the specific treatment.

Why not create a new term materialMention in the meaning discussed here? It could then include anything that mentions a specimen, and, if one doesn't care, also the MaterialCitation.

Finally, there are now over 1M MaterialCitation produced.

myrmoteras avatar Nov 04 '21 18:11 myrmoteras

Thanks, @myrmoteras ! Could you provide a definition for "scholarly publications" as it applies to MaterialCitation? Is peer review required? If so, how would peer review be defined? Or, would you adopt the definition of "publication" in the sense of the ICZN Code (in which case electronic works would need to be registered in ZooBank, among other things)?

> Why not create a new term materialMention in the meaning discussed here? It could then include anything that mentions a specimen, and, if one doesn't care, also the MaterialCitation.

What properties would apply to MaterialCitation but not MaterialMention, and vice-versa? Would it always be obvious whether a particular instance should belong to one class or the other, or would MaterialCitation be a subclass of MaterialMention?

> Finally, there are now over 1M MaterialCitation produced.

That's wonderful! But it's not a reason to restrict the scope of instances included within MaterialCitation to only those that appear in works that meet some arbitrary definition of "scholarly publication". I still see no reason why DwC needs a new class or superclass term just to allow the distinction of instances that appear within some definition of "published" work.

Don't misunderstand: I think there is value in being able to identify which MaterialCitation instances appeared in works that meet some prescribed definition of "published". But establishing this at the class level seems to be the wrong approach. Much better would be to establish properties like isPeerReviewed or isICZNpublication or whatever other properties help distinguish the subset of instances you're interested in.

If I started sharing MaterialCitation instances derived from, say, NOAA technical reports, or field notes that are scanned and available online, or a PhD thesis, or an unpublished manuscript, etc. -- would those be rejected as inappropriately classified as MaterialCitation instances? How would they be recognized as such?
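
A rough sketch of the property-based approach suggested above, under stated assumptions: `isPeerReviewed` is the illustrative property name floated in this comment, not an existing Darwin Core term, and the `source` key and sample records are invented for the example.

```python
# All four records share one class; a property carries the "published" distinction.
# `isPeerReviewed` is a hypothetical property, not a ratified Darwin Core term.
records = [
    {"basisOfRecord": "MaterialCitation", "source": "taxonomic treatment", "isPeerReviewed": True},
    {"basisOfRecord": "MaterialCitation", "source": "scanned field notes", "isPeerReviewed": False},
    {"basisOfRecord": "MaterialCitation", "source": "PhD thesis", "isPeerReviewed": False},
    {"basisOfRecord": "MaterialCitation", "source": "technical report"},  # property not supplied
]


def peer_reviewed_citations(rows):
    """Select MaterialCitation rows explicitly flagged as peer reviewed."""
    return [
        row for row in rows
        if row.get("basisOfRecord") == "MaterialCitation"
        and row.get("isPeerReviewed") is True
    ]


for row in peer_reviewed_citations(records):
    print(row["source"])
```

Consumers who only want the stricter subset filter on the property, while the class itself stays broad enough to hold field notes, theses, and reports.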

deepreef avatar Nov 04 '21 19:11 deepreef

This is getting really interesting. @deepreef , one of the ramifications of this issue, related to your question above, is why the taxonomy of Darwin Core terms has only properties and classes, and what rank "leaf" terms belonging to controlled vocabularies should receive. As I see it, one of the main shortcomings of the current taxonomy is exactly that: having just properties and classes, and having "leaf" terms at the rank of classes. As I am not so experienced with DwC, I will stop here and ask: Am I missing other categories of terms? If so, which are those?

Then, a Darwin Core class is a special category of terms used to group sets of terms for convenience. I don't see why basisOfRecord is just a property, while PreservedSpecimen, FossilSpecimen, LivingSpecimen, MaterialSample, Event, HumanObservation, MachineObservation, Taxon, Occurrence, and MaterialCitation, which are just values of the basisOfRecord controlled vocabulary and "leaf" terms, are classes. Do the "classes" under discussion here group any other DwC terms? I'm not seeing that.

To solve the current issue, I think that we need to look back and reevaluate whether basisOfRecord is just a property, or a class containing the controlled vocabulary terms as properties, or as "examples". And we also need to think about whether we need to work towards developing the taxonomy itself.

Archilegt avatar Nov 06 '21 13:11 Archilegt

@Archilegt I don't think there is really any question about the types of terms in Darwin Core, or what basisOfRecord is. The semantics, such as they exist, are pretty much laid out in the TDWG Standards Documentation Specification (SDS; Machine-readable documents section 4) and Darwin Core RDF Guide.

Section 4.1.2 of the SDS describes three classes that can apply to vocabulary terms: properties (typed as rdf:Property), classes (typed as rdfs:Class), and controlled vocabulary terms (typed as skos:Concept). There are more details in Section 4, but they aren't important here -- you can refer to them if you want to know more.

Section 2.3.1.4 of the RDF Guide discusses dwc:basisOfRecord. It makes it clear that dwc:basisOfRecord is a property used to designate the type of a resource. The value of properties used to designate type will be classes. This is related to the general semantic meaning of "type", i.e. what class is the thing a member of.

In this way, dwc:basisOfRecord is different from other terms whose values should be controlled vocabulary terms. For most other terms expecting controlled values, the values should be concepts designated by either a controlled value string or an IRI defined as part of a SKOS ConceptScheme. For example, the values of dwc:pathway should be controlled value strings from the Darwin Core pathway controlled vocabulary. But because dwc:basisOfRecord is a type-designating property, its values must be classes. The expectation is that those classes will be the ones defined by Darwin Core. Any class in DwC can be a value, which is why classes like dwc:PreservedSpecimen have been designed. But some of the classes (like dwc:Occurrence) are also used to group terms. That's currently not the case for some of the classes that are just there to be used as basisOfRecord values.
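
The rule that dwc:basisOfRecord values must be classes can be sketched as a simple lookup. This is a simplified illustration, not how any Darwin Core tooling works: the `TERM_TYPES` table is a hand-written subset covering only terms named in this thread, and `is_valid_basis_of_record` is a hypothetical helper.

```python
# Hand-written subset of term typings, limited to terms mentioned in this thread.
TERM_TYPES = {
    "dwc:basisOfRecord": "rdf:Property",
    "dwc:pathway": "rdf:Property",
    "dwc:Occurrence": "rdfs:Class",
    "dwc:PreservedSpecimen": "rdfs:Class",
    "dwc:MaterialCitation": "rdfs:Class",
}


def is_valid_basis_of_record(value):
    """A basisOfRecord value is acceptable only if it names a known class."""
    return TERM_TYPES.get(value) == "rdfs:Class"


print(is_valid_basis_of_record("dwc:PreservedSpecimen"))  # a class
print(is_valid_basis_of_record("dwc:pathway"))            # a property, so rejected
print(is_valid_basis_of_record("dwc:MaterialMention"))    # not in the table, so rejected
```

Values of a property such as dwc:pathway would be checked against a SKOS concept scheme instead, which this sketch leaves out.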

I hope this clears things up a bit.

baskaufs avatar Nov 06 '21 20:11 baskaufs

@Archilegt One followup comment: there are actually not only classes and properties in Darwin Core. There are only those two types of terms in the main vocabulary, but there are also three ratified controlled vocabularies that are also part of Darwin Core. Their terms are SKOS Concepts. You can see those terms via the "Terms" dropdown at the top of any of the DwC standards documents (establishmentMeans, pathway, and degreeOfEstablishment).

baskaufs avatar Nov 06 '21 20:11 baskaufs

@Archilegt: I defer to @baskaufs on the main parts of your post, but with respect to:

> For solving the current issue, I think that we need to look back and reevaluate whether basisOfRecord is just a property, or a class containing the controlled vocabulary terms as properties, or as "examples". And we also need to think if we need to work towards developing the taxonomy itself.

There's been a fair bit of recent discussion on this in the Material-Sample group, especially here and here. I'm sure there's also been other discussions in the dwc issue tracker, as well as some of the recorded sessions from TDWG 2020.

deepreef avatar Nov 06 '21 21:11 deepreef

This issue remains controversial and cannot be progressed to a recommendation in this milestone of public review.

tucotuco avatar Mar 29 '23 21:03 tucotuco