
New term - typifiedName

Open tucotuco opened this issue 9 years ago • 72 comments

New Term

Submitter: Markus Döring

Justification: Clear separation of the type status and the scientific name that is typified by a type specimen, the subject. Looking at how dwc:typeStatus has been used in all of GBIF's specimen data, one can see there is a need to express this, but it would be better handled with a term of its own, leaving typeStatus for the status of the type only. The term name itself is also used by ABCD: http://wiki.tdwg.org/twiki/bin/view/ABCD/AbcdConcept0603

Organized in Class (e.g., Occurrence, Event, Location, Taxon): Identification

Definition: Scientific name of which Organism is a nomenclatural type.

Comment: It is recommended to also indicate the typeStatus of the Organism.

Refines: None

Replaces: None

ABCD 2.06: DataSets/DataSet/Units/Unit/SpecimenUnit/NomenclaturalTypeDesignations/NomenclaturalTypeDesignation/TypifiedName

Original comment:

Was https://code.google.com/p/darwincore/issues/detail?id=197

==New Term Recommendation== Submitter: Markus Döring

Justification: Clear separation of the type status and the scientific name that is typified by a type specimen, the subject. Looking at how dwc:typeStatus has been used in all of GBIF's specimen data, one can see there is a need to express this, but it would be better handled with a term of its own, leaving typeStatus for the status of the type only. The term name itself is also used by ABCD: http://wiki.tdwg.org/twiki/bin/view/ABCD/AbcdConcept0603

Definition: The scientific name that is based on the type specimen.

Comment: It is recommended to also indicate the typeStatus of the specimen.

Refines:

Has Domain:

Has Range:

Replaces:

ABCD 2.06: DataSets/DataSet/Units/Unit/SpecimenUnit/NomenclaturalTypeDesignations/NomenclaturalTypeDesignation/TypifiedName

A typical example how typeStatus is used currently is:

ISOTYPE of Polysiphonia amphibolis Womersley

which we could express much better with 2 terms:

dwc:typeStatus=ISOTYPE

dwc:typifiedName=Polysiphonia amphibolis Womersley
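The split could be sketched roughly as follows. This is only an illustration: the function name `split_type_status` and the regular expression are assumptions, and real typeStatus values are far messier than the single "STATUS of name" pattern handled here.

```python
import re

# Hypothetical pattern for values shaped like "<STATUS> of <scientific name>".
# Anything that does not match is returned as a bare status.
_OVERLOADED = re.compile(
    r"^\s*(?P<status>[A-Za-z]+)\s+of\s+(?P<name>.+?)\s*$",
    re.IGNORECASE,
)


def split_type_status(value):
    """Split an overloaded typeStatus string into (typeStatus, typifiedName)."""
    match = _OVERLOADED.match(value)
    if match is None:
        # No " of " part: treat the whole value as the status only.
        return value.strip(), None
    return match.group("status").upper(), match.group("name")


status, name = split_type_status("ISOTYPE of Polysiphonia amphibolis Womersley")
print(f"dwc:typeStatus={status}")
print(f"dwc:typifiedName={name}")
```

Values that list several names or carry extra remarks would need manual review rather than a pattern like this.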

tucotuco avatar Nov 13 '14 14:11 tucotuco

This proposal needs more evidence for demand (see the Vocabulary Maintenance Specification - Section 3.1). Anybody who is interested in the adoption/change of this term, should comment with their use case below. If demand is not demonstrated by the next annual review of open proposals (late 2020), this proposal will be dismissed.

peterdesmet avatar Oct 21 '19 16:10 peterdesmet

Ping @mdoering

peterdesmet avatar Oct 21 '19 16:10 peterdesmet

There is certainly a need for this, and nomenclatural information like this is certainly underworked. Why not add typifiedName to the TypesAndSpecimen extension? Currently it includes scientificName, which is not the same thing and is easily confused with dwc:scientificName. https://tools.gbif.org/dwca-validator/extension.do?id=gbif:TypesAndSpecimen

qgroom avatar Oct 27 '19 14:10 qgroom

How does this addition work when there are multiple typified names for a single specimen? Currently this would be concatenated into dwc:typeStatus, e.g. https://www.gbif.org/occurrence/1839378016

matdillen avatar Sep 23 '20 15:09 matdillen

I strongly support this and can provide a use case if you need it.

@matdillen , having more than one type specimen is, at least in botany, a very rare occurrence and then the specimen is a syntype or paratype (not really types) of one name and the holotype of a more recent name, so you choose the latter name. My experience though is that when you see more than one typified name for a specimen that is almost always an error.

@qgroom , many people see nomenclatural type designations as Identifications, so in that sense, scientificName in the Types and Specimens Extension seems appropriate. I cannot get a clear picture in my mind whether, if you include both types and selected specimens examined, this might become ambiguous or not, so you may be completely right.

nielsklazenga avatar Sep 24 '20 13:09 nielsklazenga

I was just directed to this thread - I strongly support the need as well!! Being able to do this in ABCD (like this https://www.gbif.org/occurrence/1638363416) but not DwC makes it difficult to effectively cluster collections. If a specimen is the type of name X but is only listed in GBIF by its current determination Y, clustering by looking for name X would miss that specimen.
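A minimal sketch of that clustering problem, assuming flat records held as Python dicts. The record values, the helper name `find_by_name`, and the use of a `typifiedName` key (the proposed term, not an adopted one) are all illustrative assumptions.

```python
def find_by_name(records, name):
    """Return records whose current determination or typified name equals `name`."""
    hits = []
    for rec in records:
        # typifiedName is the proposed term; records without it only match
        # on their current determination.
        if name in (rec.get("scientificName"), rec.get("typifiedName")):
            hits.append(rec)
    return hits


records = [
    # Type of "Name X", currently determined as "Name Y".
    {"occurrenceID": "spec-1", "scientificName": "Name Y",
     "typeStatus": "holotype", "typifiedName": "Name X"},
    # Ordinary specimen of "Name Y".
    {"occurrenceID": "spec-2", "scientificName": "Name Y"},
]

print([r["occurrenceID"] for r in find_by_name(records, "Name X")])
```

Without the `typifiedName` value on the first record, a search for "Name X" would return nothing.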

RRabeler avatar Feb 24 '21 01:02 RRabeler

We have semantics in TaxonWorks that would require this one-to-many relationship between collection object and taxonomic name: https://rdoc.taxonworks.org/TypeMaterial.html.

mjy avatar Mar 25 '21 19:03 mjy

@peterdesmet I note we missed the 2020 review period, but clearly there's interest in moving this conversation forward. I've asked @RRabeler to get other colleagues to weigh in too.

debpaul avatar Mar 25 '21 20:03 debpaul

I guess I'm a little confused. Throughout most of DwC information is captured in one place, not multiple places. We're talking about a relationship between a specimen and a scientificName, which already exists within the Identification class. At the moment, the typeStatus term is (correctly) grouped with that class. Presumably, each instance of Identification joins one instance of a specimen (technically MaterialSample, but I suppose many people would directly link it to an instance of Occurrence, as instances of that class are often used as proxies for the associated MaterialSample). Thus, the expression "specimen X typifies name Q" is easily (and appropriately) captured within an instance of Identification.

We have a habit in biodiversity informatics of defining good terms and clustering them into good classes, then not using them broadly (I'm looking at you, MaterialSample, ResourceRelationship, MeasurementOrFact). Obviously, some people use these classes and associated terms (especially the last), but they are, in my opinion, grossly underutilized. I think the Identification class and its terms is another example that should be built in to our information exchange systems -- but that's a rant for another issue/thread.

So, the issue with a specimen and a name as a type is not an inherent property of either the specimen or the name -- it's a property of an assertion joining the specimen and the name, which is why it's correctly clustered within the Identification class. But even if you ignore the data model normalization thing, and flatten a record into a DwCA dataset, isn't the information for typifiedName already represented via all the terms from the Taxon class also included with that row?

Suppose I have a type specimen in my collection, and I expose it via a DwCA dataset. One property of that record is materialSampleID, or even occurrenceID, or at least the DwC triplet of ICode+CCode+CatNumber -- so we know what specimen we're talking about. Another property is typeStatus, so we can capture ISOTYPE. And another set of properties are all the Taxon class terms -- including scientificName. So don't we already have typifiedName represented in the form of scientificName for the same record (representing a PreservedSpecimen) that includes a value for typeStatus?

The problem, I assume, is that datasets will represent a MaterialSample record (camouflaged as an Occurrence record) where the associated scientificName is represented as the "current accepted" taxon, rather than the name for which the specimen plays its role as typeStatus (and then include the typified name within the typeStatus field). Thus, the addition of typifiedName would allow the value of scientificName (and other terms of the Taxon class) to reflect the current taxonomic identification of the specimen, while also indicating that the specimen serves as a type of a different taxonomic name. I get that -- but isn't the better solution to educate content providers that they should be using acceptedNameUsage for this purpose? Or better yet -- start actually using the Identification class for what it was intended (i.e., allowing a single specimen to be represented with multiple taxon identifications, with typeStatus applied only for the one scientificName that it actually typifies).
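To make the normalized alternative concrete, here is a rough sketch of one specimen carrying several Identification instances, with typeStatus set only on the one for the name it typifies. The class, field names, and values are hypothetical stand-ins for the DwC terms, not an official serialization.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Identification:
    material_sample_id: str
    scientific_name: str
    # Populated only on the identification that links the specimen
    # to the name it typifies.
    type_status: Optional[str] = None


def typified_names(identifications, material_sample_id):
    """Names for which the given specimen is recorded as a type."""
    return [
        ident.scientific_name
        for ident in identifications
        if ident.material_sample_id == material_sample_id and ident.type_status
    ]


identifications = [
    Identification("ms-1", "Name Q", type_status="ISOTYPE"),
    Identification("ms-1", "Name R"),  # current determination, not a typification
]

print(typified_names(identifications, "ms-1"))
```

In this layout the typified name needs no term of its own; it is simply the scientificName of the identification that carries a typeStatus.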

I do understand that we live in the real world, where people provide content in highly flattened/denormalized form, and this leads to crude efforts to overcome the inherent limitations of doing so (such as adding information about the typified name within the typeStatus property). In this context, I guess it makes sense to add this new term -- but from my perspective, the term would exist only to further enable us to avoid leveraging the capabilities already built into the DwC standard, and perhaps represents a step backwards from fully realizing those existing capabilities within our information exchange systems.

deepreef avatar Mar 26 '21 17:03 deepreef

I agree with @deepreef that the "bits" of data can all be shunted into a format that is sharable as is. I suspect though, that 'typifiedName' represents a need for teasing out semantics?

Perhaps our model in TaxonWorks will further confuse things, or help draw this to a conclusion. Capitalized words are Classes.

TaxonDetermination

  • Links a Specimen to an OTU (biological concept)
  • Subjective
  • Model contains no rules other than exact duplication is prevented (no point in asserting the same thing twice)
  • Can NOT be used to infer TypeMaterial status (it links a Specimen to a concept, not a Name)

TypeMaterial

  • Links a Specimen to a TaxonName (nomenclatural concept)
  • Objective
  • Model contains semantic rules that reflect governed rules of nomenclature
  • Only allows type types that are governed by the rules of nomenclature (for example isotype does not belong here)
  • Can be used to infer the presence of a TaxonDetermination (if one assumes there was an OTU concept behind the Name as it was made available, a pretty safe assumption)

Instances of these data can be fed to DwC as is, I'd have to look at specifics to dig into where we put the bits.
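As a loose illustration of the two classes described above (this is not TaxonWorks code; the field names, the `add_determination` helper, and especially the contents of `GOVERNED_TYPE_TYPES` are assumptions made for the sketch):

```python
from dataclasses import dataclass

# Illustrative placeholder only: which type types count as governed
# depends on the nomenclatural code and is not taken from TaxonWorks.
GOVERNED_TYPE_TYPES = frozenset({"holotype", "lectotype", "neotype", "syntype"})


@dataclass(frozen=True)
class TaxonDetermination:
    """Subjective link from a Specimen to an OTU (biological concept)."""
    specimen_id: str
    otu_id: str


@dataclass(frozen=True)
class TypeMaterial:
    """Link from a Specimen to a TaxonName (nomenclatural concept)."""
    specimen_id: str
    taxon_name_id: str
    type_type: str

    def __post_init__(self):
        # Semantic rule: reject type types outside the governed set.
        if self.type_type not in GOVERNED_TYPE_TYPES:
            raise ValueError(f"{self.type_type!r} is not a governed type type")


def add_determination(existing, new):
    """Append a determination unless an exact duplicate is already present."""
    if new in existing:
        return False
    existing.append(new)
    return True
```

With this sketch, `TypeMaterial("s1", "n1", "isotype")` raises a `ValueError`, while adding the same `TaxonDetermination` twice leaves a single entry in the list.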

mjy avatar Mar 26 '21 18:03 mjy

@deepreef I understand those worries, but DwC (and other standards like ABCD or EML too) has lots of terms only existing to allow flat views. acceptedNameUsage really is the scientificName of the Taxon record linked via acceptedNameUsageID. kingdom, phylum and the other flat ranks are similar convenience terms. The same is true for other location and temporal terms that should sit on an Event or even Location instance. By far the vast majority of DwC use is flat. It's what was (once?) called Simple Darwin Core. No doubt we should be moving to a more relational world, but I think the proposed term makes a lot of sense for the current use of DwC.
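A small sketch of what "convenience term" means here, assuming records and taxa are plain Python dicts: acceptedNameUsage is filled in from the Taxon record that acceptedNameUsageID points to. The `flatten` helper, the identifiers, and the names are made up for illustration.

```python
def flatten(record, taxa):
    """Copy a record and add acceptedNameUsage from the linked Taxon record."""
    flat = dict(record)
    accepted = taxa.get(record.get("acceptedNameUsageID"))
    if accepted is not None:
        # The flat term repeats the scientificName of the linked taxon.
        flat["acceptedNameUsage"] = accepted["scientificName"]
    return flat


taxa = {
    "t1": {"taxonID": "t1", "scientificName": "Accepted name A"},
}
record = {"occurrenceID": "occ-1", "scientificName": "Synonym B",
          "acceptedNameUsageID": "t1"}

print(flatten(record, taxa)["acceptedNameUsage"])
```

A flat typifiedName next to typeStatus would play the same role for a typification that would otherwise sit in its own linked record.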

mdoering avatar Mar 26 '21 19:03 mdoering

I think the issue is that, currently, the usage of typeStatus is different when used in the Occurrence Core than when used in the Identification History or the Types and Specimen Extensions. While in the extensions typeStatus is just the kind of type, in the Occurrence Core it also includes the typified name and other information, so an entire Identification if you see it that way.

Despite being in the Identification class, the way typeStatus is defined in Darwin Core makes it one of those terms that only exist to allow flat views (free after @mdoering just above) and only suitable for use in the Occurrence Core. So, people who want to deliver typification in the Identification History extension should support this proposal to split off the typified name from the type status (or kind of type).

I do not really want to go into why nomenclatural type designations are not Identifications, as that is not what matters here. What matters is that, if we treat them as Identifications, they are not (necessarily) the same Identification as the current Identification, which we deliver in the Occurrence Core. So we basically want to have two Identifications in the Occurrence Core, the current Identification and the nomenclatural type designation Identification. In order to allow consistent use of (a redefined) typeStatus, we also need to have a way to distinguish between the scientificName from the current Identification and that from the nomenclatural type designation Identification in the Occurrence Core record, which is where the proposed typifiedName comes in.

nielsklazenga avatar Mar 26 '21 22:03 nielsklazenga

@mdoering : OK, fair points! I just wanted to make sure I understood. So am I correct that the problem is that people represent type specimens with names that are different from what the specimens typify, and that for whatever reason they're not using the acceptedNameUsage term to capture the current name, and scientificName for the original typified name? (There is no requirement that acceptedNameUsageID must be populated in order to provide a value for acceptedNameUsage.) Also, most of the "flattened" terms aren't redundant to other terms/structures already in DwC, and/or weren't established with the explicit intention of accommodating flattened representations of the data. But if you feel there is a need for this term as an additional flat-friendly way of capturing information that people are presenting in typeStatus or some other incorrect way, then I wouldn't push back against it. I just wanted to make sure I understood the need. Also... what class would the term belong to? Would it be best to include it within the Identification class, or the MaterialSample class? (Please, Please not Occurrence!!)

@mjy :

Model contains no rules other than exact duplication is prevented (no point in asserting the same thing twice)

If you mean "exact" as in same determiner, same date, same taxon; then I agree. But we allow multiple determiners to assert the same taxon on the same specimen; and also the same determiner to assert the same taxon on the same specimen on different dates (why throw away information). But I agree that same determiner, same specimen, same taxon, same date is redundant.

TypeMaterial Links a Specimen to a TaxonName (nomenclatural concept) Objective

That's how I used to model it, and that's how a lot of nomenclators model it; but there is enough grey area in this space that I finally had to acknowledge that "specimen is type of taxon" is not as "objective" as we all wish it were, and really requires an "accordingTo" reference, just like any other assertion.

@nielsklazenga :

in the Occurrence Core it also includes the typified name and other information, so an entire Identification if you see it that way.

OK, I didn't realize this was a "thing". We represent (or at least intend to represent) our type specimens using scientificName for the typified name. If we want to represent the "current" interpretation of the name, we use acceptedNameUsage. But I agree if people are mashing additional information (like the typified name) into typeStatus, and they are unable (or unwilling) to represent it using more appropriate terms, then maybe typifiedName could be useful. But if this term is added, will people actually use it?

Despite being in the Identification class, the way typeStatus is defined in Darwin Core makes it one of those terms that only exist to allow flat views (free after @mdoering just above) and only suitable for use in the Occurrence Core.

That certainly is not what that term was originally intended for. I was unaware that the "Examples" in the DwC reference were updated to what they are now. It used to be for terms like "Holotype", "Paratype", "Lectotype", etc. But now that I see the Examples as given in the quick reference guide, I understand why it's a problem. I must have missed the discussion that updated those Examples, because I would have strongly objected to that. But if that's what people are doing, and that's what the community really thinks is now appropriate for this term, then I agree that adding something like typifiedName is the lesser of evils. It feels like a step backwards, but I guess we can't always move forward.

I do not really want to go into why nomenclatural type designations are not Identifications

Technically not Identifications, but that's by far the closest Class in DwC to which type designations belong. They're not properties of specimens (MaterialSample) or of Taxon, because they are asserted statements, not inherent facts.

Anyway, now that I see how the "Examples" for dwc:typeStatus have been updated to say, I understand why this problem exists. And if adding the term typifiedName can solve problems in the near term, I would support it.

deepreef avatar Mar 27 '21 00:03 deepreef

We represent (or at least intend to represent) our type specimens using scientificName for the typified name. If we want to represent the "current" interpretation of the name, we use acceptedNameUsage.

I wonder how common that is. I always assumed scientificName should be the current determination. @timrobertson100 I believe GBIF expects that too. It probably does not make much difference as long as the accepted name and typified name are both under the same taxon in GBIF.

mdoering avatar Mar 27 '21 00:03 mdoering

that "specimen is type of taxon" is not as "objective" as we all wish it were, and really requires an "accordingTo" reference, just like any other assertion.

But this is precisely what we are not doing. Specimen is Type of TaxonName, not Taxon (== OTU) concept. Where does this fail or become a gray area? If it does fail then the code of nomenclature doesn't work AFAIK.

mjy avatar Mar 27 '21 00:03 mjy

I do not really want to go into why nomenclatural type designations are not Identifications

I do! They are not, and we need special treatment of these facts lest they be confused for something they are not. Objective != subjective.

mjy avatar Mar 27 '21 00:03 mjy

@mdoering :

I wonder how common that is.

I had always assumed that every Museum worked this way; but maybe not? Also, despite what I write below, the closest thing there is in our world to an "objective" determination is the one that links a scientificName to a name-bearing type. From that perspective, it seems to me that scientificName most correctly should be represented as the "typifiedName", whenever a name-bearing type is in play (not so much for Paratypes).

On the other hand...

@mjy

If it does fail then the code of nomenclature doesn't work AFAIK.

Yup... sad but true. Many (most?) names don't even have types (at least not in zoology). The ICZN Code has rules for retroactively designating types, but this is generally only done when there is a specific need to do so. Indeed the Code expressly prohibits designating neotypes unless there is a specific taxonomic ambiguity that needs to be resolved. The typical example is that a series of specimens known to be available to and examined by the author of the name are regarded as a syntype series. In some cases, one of those syntypes is elevated to a lectotype. And in very specific cases, neotypes are designated. Even in modern original descriptions, the author "fixes" the type through a nomenclatural action (this was only explicitly required by the ICZN Code after 2000). So we have all kinds of situations where at one point in time a taxonomist retroactively recognizes a syntype series for an old name. Then later someone elevates one of those specimens to the status of lectotype. Then maybe someone else who is unaware of the lectotype designation picks another one of the syntype series and declares it to be the lectotype. Or, sometimes someone designates a neotype, then an original specimen (holotype or syntype series) is discovered.

Thankfully, these situations aren't common -- but they're not so rare that we can just sweep them into the dustbin of "edge case" either. As much as the Code likes to make this stuff as objective as possible, it turns out that a non-trivial number of cases involve some level of subjective interpretation (e.g., "Did the author have access to this specimen prior to establishing the name, in which case it can be considered part of the syntype series?")

That's why I had to (reluctantly) abandon my hopes and dreams to treat the relationship between a name and its type as an objective fact, as opposed to an assertion with an accordingTo.

The meaning of dwc:typeStatus, to me at least, is not so much a statement "this specimen is the type specimen of that name"; but rather something more like "the label for this specimen includes the word 'Holotype' on it, in association with this name". That's probably the most compelling evidence we have that a particular specimen is, in fact, a type specimen for a name. But I've encountered more than a few cases where the label was wrong. And not just for very old names, either.

And, as I said, Identification is not exactly the same thing as type fixation, but it's the closest thing DwC has to it.

deepreef avatar Mar 27 '21 00:03 deepreef

@deepreef

I think you've identified many cases where it's hard (or impossible) to make an assertion of a specific type. This in my mind is not the same thing as saying we shouldn't make certain assertions when they are possible, or treating our assertions specifically to mean one thing. For the record, we can stack as many Citations on either class of facts I referenced (and any class of fact) in TaxonWorks, so we can precisely reify our data based on your view. More importantly, we can enforce the rules if we need to; the same cannot be said if strong assertions are not made.

Frankly it feels like you've made a strong argument for abandoning nomenclature altogether, which in my mind is not necessarily a bad thing ;).

mjy avatar Mar 27 '21 01:03 mjy

@deepreef, we seem to be mostly in agreement.

We represent (or at least intend to represent) our type specimens using scientificName for the typified name. If we want to represent the "current" interpretation of the name, we use acceptedNameUsage.

In the past, I have (sort of) proposed the opposite, using originalNameUsage for the typified name, which is marginally less inappropriate, but of which I am still not in favour now (and was not really then). acceptedNameUsage and originalNameUsage are taxa, which do not have types. The "current" determination of a specimen and the "current" interpretation of a name are different things. The problem here is that scientificName is such a crappy name for a property, so its use in the Occurrence Core can be ambiguous (do not read this as a suggestion that this should be changed).

@mjy

I do! They are not, and we need special treatment of these facts lest they be confused for something they are not. Objective != subjective.

I agree. I think part of the problem is that typification is often confounded with annotations on specimens that the specimen is some kind of type for a scientific name (which putting it in the Identification class encourages). As you have already pointed out above, typification is to names, not taxa (like identifications are); also they are done in publications, not as annotations on specimens. Nomenclatural type designations are facts. That does not mean everybody gets their facts straight or even agrees what the facts are. However, the assertion will be in the annotation and will be in whether the specimen that is being annotated is the same as the specimen cited in the publication, or in the application of the rules of the relevant code. This is entirely different from the assertion that a specimen belongs to a taxonomic group, which is what an identification is. Putting typification-related terms in the Identification class is confounding the vehicle (annotation) with what it transports (identification or typification).

I do not think we really need a Nomenclatural Type Designation class in Darwin Core. A DwCA extension would be nice though, as I would have real problems with delivering typeStatus in the Identification History extension. I am perfectly happy to keep delivering them in the Occurrence Core, although there are relatively rare occasions where a specimen may be a syntype of one name and a holotype or isotype of a more recent name (I just deliver the latter in the Occurrence Core).

I also do not think we should abandon nomenclature altogether, although it might be best if some people would attach somewhat less importance to it. Rather, people should stop confuddling taxa and their labels and realise that nomenclature only applies to (a certain type of) the latter.

All of this has little bearing on this proposal, as regardless of how you want to treat nomenclatural type designations, we still need to separate the typified name from the type of type.

nielsklazenga avatar Mar 27 '21 03:03 nielsklazenga

This is a bit of an aside, but all this discussion makes me pity the person just trying to publish their specimen data. So much of what has been written here is undocumented.

Take for example these terms...

acceptedNameUsage: The full name, with authorship and date information if known, of the currently valid (zoological) or accepted (botanical) taxon.

scientificName: The full scientific name, with authorship and date information if known. When forming part of an Identification, this should be the name in lowest level taxonomic rank that can be determined. This term should not contain identification qualifications, which should instead be supplied in the IdentificationQualifier term.

Is there actually any substantive difference in the definition of these terms? All I can see is that you can put invalid/unaccepted names into scientificName and you can put identification qualifiers into acceptedNameUsage. Clearly, that was not the intention, but it is undocumented. Accepted names are only useful when you know who accepted them, and in this case there is no link to a publication, so the term is moot; it just means accepted within the context of this dataset.

originalNameUsage: The taxon name, with authorship and date information if known, as it originally appeared when first established under the rules of the associated nomenclaturalCode. The basionym (botany) or basonym (bacteriology) of the scientificName or the senior/earlier homonym for replaced names.

Apparently originalNameUsage actually has nothing to do with occurrence data at all. How often, and why, would anyone go to the trouble of finding out the basionym of the scientificName for an occurrence, when it might not even be the accepted name?

Whereas...

  • typifiedName: is actually highly useful and has a clear definition different from the definitions preceding.

I was struck recently by the clean and clear documentation of Schema.org, with rich descriptions and examples of real life data. If there really is an alternative to typifiedName then it would need to be properly documented within the standard along with examples. Using extensions is always a burden in maintenance and for users. Therefore, it has to be easy to implement or only the minority of people will use it and it becomes redundant.

Typification data are some of the worst kept in our domain and I am really keen to see an improvement.

qgroom avatar Mar 27 '21 07:03 qgroom

OK, we seem to be discussing two things:

  1. How to model this stuff in an ideal way (i.e., whether or not typeStatus/typification are facts about names/specimens, or assertions about the relationship between names and specimens)
  2. How to solve a practical issue related to parsing overloaded content in typeStatus, resulting from the (heinous) "Examples" given for the DwC term typeStatus.

My sense is that @mdoering raised this issue in the context of # 2; but several of us seem more interested in discussing # 1. For the record, as already stated, I support the proposal by @mdoering for # 2. But I see it as a "band aid" solution to a problem of content misrepresentation resulting from peculiar "Examples" for the DwC term typeStatus. But we will eventually need reconciliation of # 1 -- either via the TNC-TCS group (if typification is in scope), or somewhere else in DwC/TDWG-land.

@mjy

Frankly it feels like you've made a strong argument for abandoning nomenclature altogether, which in my mind is not necessarily a bad thing ;).

Yeah, sometimes I feel that way too. But I think there is value in tracking nomenclatural acts as governed by major Codes, and as manifest through TNUs. I also think there is value in tracking treatments of taxonomic concepts/circumscriptions as treatments, also manifest through TNUs. And, I think there is value in tracking organisms, as manifest through both MaterialSample instances (i.e., specimens) and in-situ observations. The relationships between names and concepts/circumscriptions can be effectively captured through TNUs directly (TNC-TCS group working on this now). The relationships between names/concepts/circumscriptions and corresponding Organism/MaterialSample instances can be captured via Identification instances.

As I've already said, I think it's a mistake to frame the status of a specimen as a nomenclatural "type" as a direct property of either the name, or of the specimen -- the typeStatus represents a relationship between an instance of a name and an instance of a specimen. Thus, without creating a new class specifically to track instances of "typification", it's a very natural fit to track typifications/typeStatus via instances of Identification -- because instances of Identification and instances of typification both represent the relationship between names and specimens. This is why I say that the Identification class is not the perfect way to represent typeStatus/typification information.

I'd like to explore this more, but I fear we'd be drifting too far from the issue at hand. Perhaps this is worth spawning a new issue?

@nielsklazenga : Yes, I think we're pretty close to agreement on the existing DwC terms in the Taxon class. Those came about in a context where the "basis of record" for a Taxon instance was intentionally vague and open, because the community had not yet settled on how to sort out taxon names and concepts. Of course, we still have not sorted that out, but it seems like we're making progress in the TNC-TCS space. My hope is that the product of that effort will wholesale supersede the current Taxon terms in DwC.

As you have already pointed out above, typification is to names, not taxa

Agreed! This is why placing typeStatus within the Identification class is not perfect (but better than the existing alternatives).

also they are done in publications, not as annotations on specimens.

Well... sort of. Under the nomenclatural Codes, typifications are events/acts that occur within publications (which is why they are best framed as assertions). However, in practice -- and even to some extent in the sense of the Codes -- the specimen label annotation over-rides what appears in publications. I can provide specific examples of this.

I would have real problems with delivering typeStatus in the Identification History extension.

Can you explain why you would have problems with this? Included among the Identification History are the instances where the publication that fixed the type also provided an Identification of the type specimen. Those are the Identification instances (i.e., the ones where type fixation occurs) where typeStatus should be populated. Obviously, that's not how the vast majority of DwCA content is created, which is why I support the need for a band-aid typifiedName term. But if you're talking about optimizing the data model for how typification actually happens, it's a pretty damn good fit (much better than, say, "occurrence-as-specimen", which is also pretty rampant among DwCA content).

@qgroom : I definitely agree with the need for better documentation. I can comment a little on why there was (and, I think, still is) a need for three terms:

scientificName - intended to capture the name as labelled for a specimen or occurrence. This is implied to be whatever the latest "Identification" instance represents the specimen to be labelled as.

acceptedNameUsage - intended for cases where a content provider is aware that a given specimen is labelled with a name that is not consistent with the taxonomic perspective of the content provider. The reality is that there are many discrepancies within collections between what the label says for the name (usually the most recent identification from an expert who examined the specimen, which might have been decades ago), and the name that the content managers believe is the correct scientific name to apply in the modern context. This was to pave the way for collection managers to stop the highly undesirable practice of updating the taxon representation of a specimen based on a taxonomic change that did not involve anyone actually examining the specimen. We want to be able to represent the specimen both "as identified", and "as we would interpret the correct taxon name today".

originalNameUsage -- essentially intended to capture the basionym.

The intention of the "Usage" suffix on these and other terms in the Taxon class was to shift away from the highly problematic issues associated with equating names & concepts (as alluded to by @nielsklazenga), and instead pave the way to a TNU-based way of modelling taxonomic information. TCS1 was intended to provide a mechanism to enable that; but it never "took". Perhaps we can do better with TCS2.
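As a rough illustration of how the three terms described above might sit side by side on one record, here is a minimal Python sketch. The names are invented placeholders in the "Aus bus" style, the helper `name_views` is hypothetical, and the fallback to scientificName when acceptedNameUsage is empty is an assumption, not something Darwin Core prescribes.

```python
from typing import Dict, Optional

# Placeholder names, invented purely for illustration.
record: Dict[str, Optional[str]] = {
    "scientificName": "Cus bus (Smith, 1900)",   # name as labelled (latest identification)
    "acceptedNameUsage": "Cus eus Jones, 1890",  # name the provider would apply today
    "originalNameUsage": "Aus bus Smith, 1900",  # basionym
}


def name_views(rec: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Return the specimen's name 'as identified' and 'as interpreted today'."""
    as_identified = rec.get("scientificName")
    # Assumption: if no acceptedNameUsage is given, the provider agrees with the label.
    as_interpreted = rec.get("acceptedNameUsage") or as_identified
    return {
        "as identified": as_identified,
        "as interpreted today": as_interpreted,
        "basionym": rec.get("originalNameUsage"),
    }


views = name_views(record)
```

The point of keeping both values is that the label-based name is never overwritten when the provider's taxonomic opinion changes.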

Typification data are some of the worst kept in our domain and I am really keen to see an improvement.

I think this is something we can ALL agree with!!!

deepreef avatar Mar 27 '21 22:03 deepreef

This discussion highlights that typifiedName should sit together with typeStatus on the dwc:Identification class. Something not defined in the original proposal.

mdoering avatar Mar 27 '21 22:03 mdoering

Just circling back to the definition. Could we make it something like:

Scientific name of which the specimen is a nomenclatural type

?

I do not think Darwin Core needs to explain what nomenclatural types are and 'based on' definitely does not cover it.

nielsklazenga avatar Mar 28 '21 01:03 nielsklazenga

@deepreef Couldn't we put your explanations of scientificName, acceptedNameUsage and originalNameUsage into the comments of the Darwin Core terms? We need to capture this for the average user. @tucotuco should we create a separate issue?

@mdoering and all, the dwc:Identification class doesn't contain a name that the identification refers to. Indeed, the identification is to a Taxon, rather than a name. The documentation sort of hints that it refers to scientificName. However, if you add typifiedName to the dwc:Identification class then some "identifications" are going to refer to names outside the dwc:Identification class and some to names within the dwc:Identification class.

For me typeStatus and typifiedName are properties of the specimen not the Taxon

qgroom avatar Mar 28 '21 08:03 qgroom

Well... those are the definitions that are in my head now -- not necessarily the definitions that were in my head when I provided those terms to @tucotuco back when they were first added. I'm happy to capture this in whatever form is appropriate to include in online resources, provided others on this thread agree that they make sense. I wouldn't want to mess things up in the same way that the comments for typeStatus effectively changed the original purpose of the field. But if hardly anyone uses these terms anyway, maybe it's not so important. I'd also like to hear from @nielsklazenga and @mjy and @mdoering and others who spend a lot of time dealing with taxonomic data to see if these rough definitions seem OK.

the dwc:Identification class doesn't contain a name that the identification refers to. Indeed, the identification is to a Taxon, rather than a name

My understanding is that originally, instances of the Taxon class were intentionally defined broadly. They could be interpreted as concept-like things, or they could be interpreted as name-like things. Because almost no two taxonomic databases share perfect parity on their respective instances, it made more sense for the definition of a Taxon instance to be vaguely correct, instead of precisely wrong. My hope is that the entire DwC Taxon class and associated terms will be wholesale replaced by whatever comes out of the TCS2 exercise, so I wouldn't spend too much time invested in tweaking those existing terms right now.

My understanding is that occurrenceID (originally) / materialSampleID (now) and taxonID were not repeated within the Identification class because they are effectively "foreign keys" to those other respective classes. Similar to how locationID is not included within the Event class. They're implied to be in there whenever content is provided in some sort of DwC format.

In contrast to @qgroom, I still support the inclusion of both typeStatus and typifiedName within the Identification class -- not so much because they belong there, but because they belong in other existing classes even less. But I would certainly agree that having them as terms within the MaterialSample class makes MUCH more sense than within the Taxon class. Nevertheless, I still see them within the Identification class as being the least of evils.

deepreef avatar Mar 28 '21 10:03 deepreef

I am happy to put up with keeping typeStatus in the Identification class, although I think it would be more appropriately placed in PreservedSpecimen and/or FossilSpecimen, which I think are equivalent to Occurrence (so in Occurrence). I have always thought that MaterialSample is for things that have been derived from specimens, like molecular isolates, rather than the specimen itself.

nielsklazenga avatar Mar 28 '21 11:03 nielsklazenga

@deepreef Couldn't we put your explanations of scientificName, acceptedNameUsage and originalNameUsage into the comments of the Darwin Core terms? We need to capture this for the average user. @tucotuco should we create a separate issue?

Yes, please. One issue for each term change recommendation. Choose the Term Change template when creating the issues.

tucotuco avatar Mar 28 '21 16:03 tucotuco

the dwc:Identification class doesn't contain a name that the identification refers to. Indeed, the identification is to a Taxon, rather than a name. The documentation sort of hints that it refers to scientificName. However, if you add typifiedName to the dwc:Identification class then some "identifications" are going to refer to names outside the dwc:Identification class and some to names within the dwc:Identification class.

For me typeStatus and typifiedName are properties of the specimen not the Taxon

@qgroom dwc:typeStatus is currently defined as an Identification term. Would it not be awkward to place typifiedName somewhere else? And I agree with @deepreef that Identification is the closest we have to a NomenclaturalEvent type.

Note that the identification extension flattens ID and Taxon (it has to) and therefore contains scientificName: https://rs.gbif.org/extension/dwc/identification.xml

mdoering avatar Mar 28 '21 19:03 mdoering

@mdoering @deepreef I'm not saying typifiedName should not go in dwc:Identification, but once you add a taxonomic name to this class people are not going to know where the identified name is, except in the case of type specimens. So it will either need to be much better documented, or an identifiedName will have to be added to dwc:Identification.

qgroom avatar Mar 28 '21 19:03 qgroom

@qgroom : Ah! Sorry for my misunderstanding, and I see your point. People will no doubt confuse the value shown in typifiedName as being the name of the identified organism. In this sense, I see the (larger) problem: typifiedName is in some ways less a property of Identification than typeStatus is, and perhaps is better suited for being a property of MaterialSample. But then it becomes decoupled from the corresponding typeStatus. Five options come to mind:

  1. Create a new Typification Class, with properties something like:
  • typificationID [unique identifier for the Typification instance]
  • typifiedNameID [points to an instance of Taxon]
  • typifiedName [textual representation of the typified name, presumably the same value as either scientificName or originalNameUsage of the corresponding Taxon instance]
  • typeSpecimenID [points to an instance of MaterialSample]
  • typeSpecimen [textual representation of the type specimen -- not sure what this would be, as the DWC triplet is connected to the Occurrence Class]
  • typeNameID [points to an instance of the Taxon, representing the type species of a genus, or type genus of a family]
  • typeName [textual representation of the type name, presumably the same value as either scientificName or originalNameUsage of the corresponding Taxon instance]
  • typificationRemarks [yadda yadda yadda]

  2. Punt on the problem until TCS2 can replace all of the Taxon-Class terms in DWC and assume that it will include a Typification class something like the above.

  3. Create the temporary band-aid solution proposed in this issue, and simply add typifiedName to the Identification class alongside typeStatus, and hope people don't confuse its purpose.

  4. Create the temporary band-aid solution proposed in this issue, and simply add typifiedName to the MaterialSample class and also move typeStatus to this class, and hope people don't get confused.

  5. Remove the problematic "Examples" from the documentation for typeStatus, and recommend instead a controlled vocabulary for values like "Holotype", "Paratype", "Lectotype", "Isotype", etc., and don't bother with this new term.
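
To make option 1 a little more concrete, the proposed Typification class could be sketched as a Python dataclass. This is only an illustration under stated assumptions: the field names come from the list above, but the class itself, the choice of optional string types, the helper method and all example identifiers and names are hypothetical and not part of any ratified Darwin Core or TCS2 term list.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Typification:
    """Hypothetical record for the Typification class proposed in option 1."""

    typificationID: str                        # unique identifier for this instance
    typifiedNameID: Optional[str] = None       # points to a Taxon instance
    typifiedName: Optional[str] = None         # textual form of the typified name
    typeSpecimenID: Optional[str] = None       # points to a MaterialSample instance
    typeSpecimen: Optional[str] = None         # textual form of the type specimen
    typeNameID: Optional[str] = None           # Taxon: type species or type genus
    typeName: Optional[str] = None             # textual form of the type name
    typificationRemarks: Optional[str] = None  # free-text remarks

    def is_specimen_typification(self) -> bool:
        """True when the type is a specimen rather than a name."""
        return self.typeSpecimenID is not None or self.typeSpecimen is not None


# Placeholder identifiers and names, invented for illustration.
example = Typification(
    typificationID="typ-0001",
    typifiedNameID="taxon-0042",
    typifiedName="Aus bus Smith, 1900",
    typeSpecimenID="ms-0007",
)

# Flatten to a row, dropping empty fields.
row = {key: value for key, value in asdict(example).items() if value is not None}
```

One record covers both cases in the list: a specimen typifying a species-group name (typeSpecimenID set) or a name typifying a higher-rank name (typeNameID set).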

There are other variants of these as well, but none of them is great. Personally, I think # 5 makes the most sense as the temporary solution until # 2 comes to fruition. Otherwise, I think # 3 is probably the least of evils.
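
For the controlled-vocabulary option, a minimal Python sketch of what cleaning existing typeStatus values might involve is given below. The vocabulary holds only the four values named above (a real list would be longer), and the function `split_type_status`, the assumption that legacy values follow a "status of name" pattern, and the placeholder names are all hypothetical.

```python
from typing import Optional, Tuple

# Only the four values named in this thread; a real vocabulary would be longer.
TYPE_STATUS_VOCABULARY = ("Holotype", "Paratype", "Lectotype", "Isotype")


def split_type_status(raw: str) -> Tuple[Optional[str], Optional[str]]:
    """Split a free-text typeStatus into (controlled status, typified name).

    Assumes legacy values look like "<status> of <scientific name>".
    Returns (None, None) when the status is not in the vocabulary.
    """
    status_part, _, name_part = raw.strip().partition(" of ")
    wanted = status_part.strip().lower()
    status = next(
        (term for term in TYPE_STATUS_VOCABULARY if term.lower() == wanted),
        None,
    )
    if status is None:
        return None, None  # unrecognised value: leave for manual review
    return status, (name_part.strip() or None)


status, typified_name = split_type_status("holotype of Aus bus Smith, 1900")
```

Whatever remains after the status is exactly the content that a separate typifiedName term would hold, so a provider following this option alone would have to drop it or put it elsewhere.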

Why can't modelling taxonomic information be easy?

deepreef avatar Mar 28 '21 21:03 deepreef