Define objective rules for taxon concept identity

Open mdoering opened this issue 7 years ago • 129 comments

Define rules for a stable taxonID, i.e. understand when a taxon changes sufficiently to warrant an identifier change.

mdoering avatar May 04 '17 09:05 mdoering

To define what is meant by a "taxon" instance (to which a taxonID is assigned), we need to establish the "core properties" of an instance of a "taxon", such that if one of the core properties changes, a new taxonID must be issued. I think it's best to narrow the scope of those properties to representing the "contents" of a taxon, rather than the combination of contents and "context". "Contents" in this sense are the items contained within the circumscription of a taxon. For example, a taxon representing a genus would be defined by the set of species contained within it. Suppose two different assertions of a genus contain different sets of species: Aus sensu Smith contains (Aus bus+Aus cus) with species "dus" placed in genus "Xus", whereas Aus sensu Jones contains (Aus bus+Aus cus+Aus dus); then Aus sensu Smith would have a different taxonID from Aus sensu Jones because they have different contents. "Context" in this sense means placement within a hierarchical classification. Changing the context of a taxon instance should not cause a change in taxonID. For example, if Smith and Jones both assert the same contents of the genus Aus (e.g., A.bus+A.cus+A.dus), but Smith places the genus in the family Aiidae, and Jones places the genus in the family Xiidae, we do not need a different taxonID to represent Aus sensu Smith and Aus sensu Jones. Logically, this means that for a species-level concept, if the circumscriptions of both Smith and Jones for the species "bus" are the same (i.e., same heterotypic synonymy), then they have the same taxonID even if Jones treats it as "Aus bus" and Smith treats it as "Xus bus". This needs to be fleshed out in a full document.
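A minimal sketch of the "contents, not context" rule, just to make it concrete. The function name `taxon_id`, the hashing scheme and the bare epithets used as member keys are all assumptions for illustration; in practice the members would be referenced by their own identifiers.

```python
import hashlib

def taxon_id(contents):
    """Derive an identifier from a taxon's contents only (hypothetical scheme)."""
    # Sort so the ID does not depend on the order in which members are listed.
    canonical = "|".join(sorted(contents))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:12]

# Member keys are illustrative stand-ins for species-level identifiers.
aus_sensu_smith = {"bus", "cus"}          # "dus" is placed in Xus by Smith
aus_sensu_jones = {"bus", "cus", "dus"}

# Different contents give different taxonIDs.
print(taxon_id(aus_sensu_smith) != taxon_id(aus_sensu_jones))

# Context (family placement) is simply not an input to taxon_id,
# so moving the genus from Aiidae to Xiidae cannot change the ID.
same_contents_in_aiidae = taxon_id({"bus", "cus", "dus"})
same_contents_in_xiidae = taxon_id({"bus", "cus", "dus"})
print(same_contents_in_aiidae == same_contents_in_xiidae)
```

Because the classification never enters the function, the "context" rule holds by construction rather than by a check.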

deepreef avatar May 04 '17 15:05 deepreef

I agree with excluding the position in the classification, the "context", from the taxon concept identity. As for the included species, we should find a way to allow newly described species to be added to a genus without changing its identity, as long as the species were not moved from another genus. See also

As for a final output I agree this needs to be written up somewhere else, probably as part of the API documentation. But in general I would like to get an agreement on the key points in these issues first instead of creating lots of documents that tend to consume a lot of overhead just for styling and explaining the context.

mdoering avatar May 04 '17 21:05 mdoering

Yes, exactly! Dave and I discussed this at some length at Woods Hole. What it boils down to is this: When a new species is described, what would its type specimen have been identified as prior to the new species description? If it would have been identified as an earlier-named species, then what we have is a case where one larger species was split into two smaller species. As such, the circumscription of the genus doesn't change. On the other hand, in the cases of "brand new" species, which would not have had ANY taxonomic identity prior to their description, the concept for the genus would need to change. Obviously, this is subjective in many cases, and not obvious. But from an informatics perspective, I think the cleanest answer involves how the link between names and their corresponding name-bearing types is made. I suggested to Dave that we could have a "retroactive identification" system, whereby it can be asserted that the type specimen of a new species would have been identified as species "X" prior to the new species description. This can be proxied through assertions of heterotypic synonymy, if we don't want to get all the way down to identifications of specimens. I will take some time this weekend to come up with a better diagram than what is shown above. I actually started one in Woods Hole, so I will finish that and then share it.
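To illustrate the "retroactive identification" idea, here is a rough sketch. The class `NewSpecies`, its field `prior_identification` and the two helper functions are hypothetical names, not part of any existing system.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class NewSpecies:
    name: str
    # Species the type specimen would have been identified as before the
    # new description; None means it had no prior taxonomic identity.
    prior_identification: Optional[str] = None

def classify(sp: NewSpecies) -> str:
    """Label a new species as a 'split' of an older one or as 'brand new'."""
    return "split" if sp.prior_identification else "brand new"

def genus_circumscription_changes(sp: NewSpecies) -> bool:
    """Under the rule sketched above, only brand-new species widen the genus."""
    return classify(sp) == "brand new"

dus = NewSpecies("Aus dus", prior_identification="Aus bus")  # carved out of A. bus
eus = NewSpecies("Aus eus")                                  # never seen before

print(classify(dus), genus_circumscription_changes(dus))
print(classify(eus), genus_circumscription_changes(eus))
```

The same flag could be filled in from a heterotypic synonymy assertion instead of a specimen identification, as suggested above.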

deepreef avatar May 04 '17 21:05 deepreef

By the way, if we can address this informatically, then we will have also created a REALLY valuable tool to distinguish "true" new species from "split" new species. An analogy of the difference is the two kinds of "Gaps" in CoL (actual taxonomic gaps vs. synonym name gaps). This is important because it distinguishes cases of new species that increase our understanding of the scope of biodiversity from cases of drawing new lines within our already-existing understanding of the scope of biodiversity. We've never been able to do this before.

deepreef avatar May 04 '17 22:05 deepreef

Would we want a genus concept to change when a brand new species gets added? It would mean genus identifiers change quite a lot over time and we lose stability. It might be more useful to restrict changes of genus concepts to true splits and merges of genera, ignoring the exact number of included species for the most part and focusing on the genus types, as we discussed at some point. This needs real-world examples to test.

mdoering avatar May 04 '17 22:05 mdoering

It depends on how much you want to reflect reality, and it also depends on what you mean by stability. The unfortunate reality is that the meaning of a genus-level taxon concept DOES change when a truly new species is added. However, if we want less precise but more stable taxon identifiers for genera, then we can treat them the same way as species. That is, instead of defining them by the circumscription of all individuals, we can limit the definition to be circumscriptions of types (type species for genus concepts, and type specimens for species concepts). Unfortunately, as we discovered in our discussions at Woods Hole, we lose important information about taxa when we fail to distinguish the case of one species-level taxon that is split into two vs. a brand new species being added (impetus for the diagram in the photo you included above).

Also, "stability" is actually INCREASED with increased precision, because there is less subjectivity in the definition. The problem isn't a loss of stability, the problem is a proliferation of subtle variants (e.g., Aus Smith sec. Smith vs. Aus Smith sec. Jones). All of these variants are themselves stable; but they confuse matters because we have no good way to reflect the differences in meaning between two precisely-defined genus-level taxon concepts.

deepreef avatar May 05 '17 20:05 deepreef

Right, the genus concept changes when a new species is added when you look at the included species. But is this really useful for anyone?

It seems to me that what matters for defining the concept is rather how a genus is delimited from other genera. Merging and splitting again. For example, the genus Acacia can be referred to as the concept sensu lato, including all species nowadays in Vachellia, or sensu stricto when you also acknowledge the existence of Vachellia.

mdoering avatar May 08 '17 10:05 mdoering

Personally, I'm happy with defining a taxon by the set of "types" it contains. That is, a "species" concept represents the sum of the species-group protonyms (as proxies for type specimens) assigned to it as heterotypic synonyms, and a "genus" concept is the set of genus-group protonyms (as proxies for type species) assigned to it as heterotypic synonyms. To me, that solves 80% of the problem with 20% of the effort. However, as we discussed in Woods Hole, this completely misses the ability to discern the "sensu lato/sensu stricto" cases where an existing species is split into two. That is, there is no way to distinguish "Aus bus Smith sec. Smith" (sensu lato) from "Aus bus Smith sec. Jones" (sensu stricto) -- when Jones splits Aus bus into Aus bus Smith sec. Jones and Aus dus Jones sec. Jones. The same applies to all ranks (Genus and above).

Like I said, limiting it to heterotypic synonymy gets 80% of the job done with 20% of the effort. If we want to go beyond that, I think it would be better handled by a system of "RelationshipAssertions" (sensu TCS).
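A small sketch of both halves of this: a concept keyed by its set of protonyms, and the sensu lato/sensu stricto case that such a key cannot see. The helper `concept_key` and the relationship tuple layout are assumptions for illustration; only the idea of a relationship assertion between two concepts comes from TCS.

```python
def concept_key(protonyms):
    """Identify a concept purely by the set of protonyms (types) assigned to it."""
    return frozenset(protonyms)

# Smith's broad species: only the protonym "bus Smith" is involved.
bus_sec_smith = concept_key({"bus Smith"})

# Jones splits it into a narrower A. bus plus a newly named A. dus.
bus_sec_jones = concept_key({"bus Smith"})
dus_sec_jones = concept_key({"dus Jones"})

# The type-set key cannot tell sensu lato from sensu stricto.
print(bus_sec_smith == bus_sec_jones)

# An explicit assertion (hypothetical tuple layout) carries the missing information.
relationship_assertions = [
    ("Aus bus Smith sec. Smith", "includes", "Aus bus Smith sec. Jones"),
    ("Aus bus Smith sec. Smith", "includes", "Aus dus Jones sec. Jones"),
]
for subject, relation, obj in relationship_assertions:
    print(subject, relation, obj)
```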

deepreef avatar May 08 '17 20:05 deepreef

Three implementations dealing with tracking taxon concept changes:

mdoering avatar May 09 '17 15:05 mdoering

Should the identity stay the same if just the name changes? E.g. one of the synonyms gets accepted, or the name changes its rank, e.g. a species is now considered a subspecies? Type- and concept-wise these are the same, so the identifier should not change, correct?

mdoering avatar May 09 '17 15:05 mdoering

As we discussed already, but too briefly in Woods Hole, I think that defining a taxon (=concept) by its content is not enough or even may be useless.

A taxon (e.g. genus) has its own definition. Adding or removing a species that fits with its definition does not change the taxon definition: it remains the same while its sum has changed! In other terms, trying to define a taxon by the sum of its species is not so useful: different sums could lead to the same taxon and then the same UI, which is not what we want, I suppose. I might be wrong, but I don’t see this as practicable for the issue of UIs. Additionally (even if it would probably be the best thing to do), I don’t think that we are going to suggest changing the UI each time we add or remove a species from a genus.

Conversely, with its own definition a taxon carries a series of implicit characters that link it to a particular place in the hierarchy (classification or phylogeny). If you change the place where you hang this taxon, you change all these implicit characters that define the taxon = you change the full/complete definition of the taxon -> you change the taxon.

I feel that these are the changes which are really necessary to track, the ones that are important for CoL.

Not sure I’m clear here ;-)

ThierryBourgoin avatar May 09 '17 17:05 ThierryBourgoin

@ThierryBourgoin I see your point and it makes a lot of sense. There are various ways to look at what the essence of a taxon is and exactly this is why we need to agree on one definition.

We should probably step back and approach the problem from a users perspective. What does a user want from a CoL taxon and why does it need an identifier at all?

  1. someone uses the catalogue at some point and wants to have a persistent reference to the exact version he was looking at that time. That would require a fully versioned CoL with every change triggering a new identifier.

  2. people have identified an organism to a CoL taxon, e.g. a specimen or observation. They want access to the current view of the "same taxon" in the CoL that still represents that organism observed. But maybe with a different name, classification or other updated "metadata". This does not require a taxon concept id per se, just a way to get to the (different) identifier for the latest version of the same concept. The concept identifier basically is internal only - but the system still needs to know about concepts. This mostly applies to species- and infraspecific taxa so we probably would not need to worry about higher taxa, but maybe genera.

  3. researchers want to aggregate species related information from different systems, all linked to CoL taxa. They want to be sure the different systems talk about the same taxon concept and information can safely be transferred and merged. This seems to require shared concept ids.

From the above I feel we need 2 identifiers, one for the exact version and one for the taxon concept to assert a concept is the same.
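Roughly, the two-identifier idea could look like the sketch below. `TaxonRecord`, `version_id` and `revise` are made-up names, and the `<concept>.v<n>` format is only an assumption.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class TaxonRecord:
    concept_id: str   # stays the same while the concept is considered the same
    version: int      # increases with every change to the record
    name: str
    parent: str

    @property
    def version_id(self) -> str:
        """Identifier for one exact version of the record (use case 1)."""
        return f"{self.concept_id}.v{self.version}"

def revise(rec: TaxonRecord, **changes) -> TaxonRecord:
    """Return a new version with updated metadata but the same concept_id."""
    return replace(rec, version=rec.version + 1, **changes)

r1 = TaxonRecord("c1", 1, "Acacia aroma", "Acacia")
r2 = revise(r1, name="Vachellia aroma", parent="Vachellia")

print(r1.version_id, r2.version_id)      # exact versions differ
print(r1.concept_id == r2.concept_id)    # the concept is asserted to be the same
```

Use case 2 then amounts to looking up the highest version for a given `concept_id`, and use case 3 to comparing `concept_id` values across systems.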

The question now is how to know that a concept (as in the set of all theoretically included individuals) is the same. We can either find a way to detect that automatically or rely on experts to tell us. The problem with experts is that they will apply different judgments to what concepts are, so we will see equality of concepts asserted very inconsistently across various groups. Something that can be asserted by a computer will be much more useful, as it's predictable and comparable across all groups.

mdoering avatar May 09 '17 21:05 mdoering

Thanks, @ThierryBourgoin and @mdoering -- this is helpful. This conversation is touching on the same problems of communication that have plagued these discussions for several decades now (going back at least to the 1980s). Fundamentally, the problem is that we have different ideas about two issues:

Issue 1 is about what "things" (conceptual entities) we care enough about to label with a persistent identity. Included within this issue is the question of how to explicitly define these "things", so we know when the properties of one thing (represented by its persistent identifier) should be changed (without changing the identifier), vs. when a new "thing" is needed (with its own distinct identifier). At the heart of this issue is which properties of a "thing" define it (i.e., collectively represent its "essence"), and which merely represent relevant metadata associated with that "thing", which may be altered without altering the essence of the "thing".

Issue 2 is about semantics, that is, which terms we use to label each class of "thing". The most problematic terms are "name" and "concept". Both have various synonyms and homonyms in our conversations. What has become clear as a result of MANY conversations almost exactly like this one is that we probably have five or six different classes of "things" that we have, over the years, tried to force-fit into two terms ("name" and "concept").

My fear is that if we do not confront these two issues now, we will make very little progress solving these problems from an informatic perspective. Having dealt with these issues (from an informatics perspective) for many years, these are the "things" that I have found useful for persistently representing conceptual objects in the biological taxonomy realm:

Thing 1: An individual human being, or an entity representing an organization created by human beings. I have used the term "Agent" to refer to this Thing.

Thing 2: A text-string label used to represent an instance of Thing 1 ("Agent"), often parsable into "Surname" and "GivenName" (for people), or a hierarchy of names (for organizations). I have used the term "AgentName" to refer to this Thing.

Thing 3: Documentation instance representing assertions made by one or more instances of Thing 1 ("Agent"), at a particular moment in time. The documentation may be a type of publication, or it may be some other form of static documentation. The word "static" here is critical, because the documentation instance represents a snapshot in time, and thus does not change. For retrieval purposes, it is best to associate each instance of Thing 3 with instances of Thing 2 (AgentName), instead of directly with instances of Thing 1 (Agent). I have used the term "Reference" to refer to this Thing.

Thing 4: A string of text characters, typically represented electronically in the form of UTF-8 encoded text, or printed in the form of glyphs rendered as ink on paper, which serves as a Linnean-style scientific name. These text strings may or may not include components representing taxonomic rank, delimiters (such as parentheses), and authorship information (various styles, formatting and with or without years). I have used the term "NameString" to refer to this Thing.

Thing 5: A specific instance of a Linnean-style taxon name represented as a conceptual entity. This applies to a particular unit of a compound name (not the full combination), which has a particular type (specimen or name) in the context of Codes, a particular rank (in the sense of Linnean ranks), and a particular authorship associated with the creation of the name. This is different from instances of Thing 4 (NameString) in that it is conceptual, not literal. The essence of an instance of Thing 5 is independent of the text string used to represent it. For example, the same instance of Thing 5 might be represented by different text strings (e.g., different genus combinations for a species, different ranks, different spellings, etc.), and more than one instance of Thing 5 might share the same text string (e.g., homonyms, homographs). I have used the term "Protonym" to refer to this Thing.

Thing 6: A particular treatment or usage of an instance of Thing 5 (Protonym) within the context of an instance of Thing 3 (Reference). Important properties of instances of Thing 6 include the exact spelling of the specific name unit (e.g., the species epithet) as it appears within the instance of Thing 3 (Reference), what taxonomic rank the instance of Thing 5 (Protonym) was asserted as within Thing 3 (Reference), whether or not the instance of Thing 5 (Protonym) was treated as a valid taxon or as a heterotypic synonym of another taxon, and a link to another instance of Thing 6 representing the immediate hierarchical taxonomic parent (e.g., the genus into which a species is placed). I have used the term "TaxonNameUsage" to refer to this Thing, but it could also be referred to as "TaxonTreatment" or just "Treatment" (following how PLAZI uses that term).

Thing 7: The set of biological organisms, including individuals that are dead, alive, and yet-to-be-born, which are explicitly or implicitly included within an asserted Taxon. THIS IS THE THING WE ARE DISCUSSING. Most people I have discussed these issues with over the years have applied the terms "TaxonConcept" and "Circumscription" interchangeably to refer to this Thing. However, as per @ThierryBourgoin's comments above, perhaps we do not have universal agreement that "Concept" and "Circumscription" are synonymous terms. Therefore I propose we use the term "Circumscription" to represent this Thing, to avoid confusion going forward.

Thing 8: This is the Thing that @ThierryBourgoin refers to in his comment above as a "Concept". Basically, its properties include elements of both Thing 7 (Circumscription, or set of included child entities) and Thing 6 (TaxonNameUsage/Treatment), such as the hierarchical classification, treatment as valid or not, and how the name is spelled. Therefore, it is different from Thing 7 (Circumscription) because it is defined by more than just the child items it contains, but it's not the same as an instance of Thing 6 (TaxonNameUsage/Treatment), because there may be many instances of Thing 6 (TaxonNameUsage/Treatment) that all imply the same instance of Thing 8.
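To make the distinctions between these Things a bit more tangible, here is a rough sketch of Things 3, 5, 6 and 7 as plain data classes. The class names follow the terms used above, but every field name and the `name_string` helper are assumptions for illustration, not an agreed data model.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

@dataclass(frozen=True)
class Reference:                 # Thing 3: static documentation instance
    author_names: Tuple[str, ...]
    year: int

@dataclass(frozen=True)
class Protonym:                  # Thing 5: conceptual name unit
    epithet: str
    rank: str
    author: str

@dataclass(frozen=True)
class TaxonNameUsage:            # Thing 6: usage of a Protonym in a Reference
    protonym: Protonym
    reference: Reference
    spelling: str
    rank: str
    is_valid: bool
    parent: Optional["TaxonNameUsage"] = None

    def name_string(self) -> str:
        """Thing 4 (NameString) rendered from this usage and its parent."""
        if self.rank == "species" and self.parent is not None:
            return f"{self.parent.spelling} {self.spelling}"
        return self.spelling

@dataclass(frozen=True)
class Circumscription:           # Thing 7, proxied here by a set of protonyms
    protonyms: FrozenSet[Protonym]

smith, jones = Reference(("Smith",), 1960), Reference(("Jones",), 1970)
p_bus = Protonym("bus", "species", "Smith")
aus = TaxonNameUsage(Protonym("Aus", "genus", "Smith"), smith, "Aus", "genus", True)
xus = TaxonNameUsage(Protonym("Xus", "genus", "Jones"), jones, "Xus", "genus", True)
bus_sec_smith = TaxonNameUsage(p_bus, smith, "bus", "species", True, aus)
bus_sec_jones = TaxonNameUsage(p_bus, jones, "bus", "species", True, xus)

# One Protonym, two usages, two different name strings.
print(bus_sec_smith.name_string(), "/", bus_sec_jones.name_string())
```

Thing 8 is deliberately left out: whether it needs its own class is exactly the open question.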

I apologize for this long post, but there is a reason we've never solved this issue as a community during the past few decades. Unfortunately, most of that reason has to do with miscommunication, and most of the miscommunication has to do with a mixture of how we define our core objects (Issue 1) and what terms we use to represent them (Issue 2; i.e., semantics).

I believe that we already have well-tested, non-contentious definitions for Things 1, 2, 3, and 4. After the dinner conversation in Woods Hole, I am confident we can fairly quickly settle on a clear definition for Thing 5. If we can achieve that, then the definition of Thing 6 is extremely easy. Therefore, the real issue for us to deal with is whether Thing 7 and Thing 8 need to be different Things, or if we can adequately accommodate them with a single Thing. Originally I thought we could get by with a single Thing, but after the comments by @ThierryBourgoin and @mdoering above, it seems we should seriously consider defining them as separate things, each with their own identifiers.

In either case, I think it's important that we understand the difference between defining what Things we need to manage in CoL-Plus, and deciding which terms to use to refer to those defined things. I think it would be a grave mistake to start defining data models and such until after we come to consensus on the Things we're managing, and the terms we're using to refer to those things.

Phew... and this is just the BEGINNING of the discussion!

deepreef avatar May 10 '17 00:05 deepreef

One more point.... in response to the comment by @mdoering above, "versioning" of CoL representations can be handled in several ways:

  1. Internally using version histories for the same identifiers plus a date-stamp;
  2. Generating new identifiers to represent each version;
  3. Capturing each new version via a new instance of Thing 6 (with Reference representing CoL as the Author and the date of the change as the date, and the properties of spelling, validity, classification, etc.)

There are other ways as well, but #3 above represents the simplest in terms of coding and implementation.
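A rough sketch of option #3, under the assumption that each CoL change is stored as a new, never-edited usage record. `Usage`, `record_version` and `current` are hypothetical names, and the dates are made up.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Usage:
    protonym_id: str
    reference: str    # "CoL" acts as the author of the usage
    as_of: date       # date of the change
    spelling: str
    is_valid: bool
    parent: str

def record_version(history, usage):
    """Append a new usage; earlier usages are never modified."""
    history.append(usage)

def current(history, protonym_id):
    """Return the most recent usage recorded for a protonym."""
    matches = [u for u in history if u.protonym_id == protonym_id]
    return max(matches, key=lambda u: u.as_of)

history = []
record_version(history, Usage("p1", "CoL", date(2017, 5, 1), "Aus bus", True, "Aus"))
record_version(history, Usage("p1", "CoL", date(2018, 9, 1), "Xus bus", True, "Xus"))

print(current(history, "p1").spelling)   # latest view
print(len(history))                      # full version history is retained
```

The appeal is that no separate versioning machinery is needed: the history is just more instances of Thing 6.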

deepreef avatar May 10 '17 00:05 deepreef

Linking the drawing from the Woods Hole CoL meeting April 2017 illustrating changing concepts (numbers) over time with types indicated by colored dots: Concept Changes

The original single species A.bus gets split into A.bus and A.fus. A.bus s.str. is then merged with A.xus. Knowing the types alone is not enough in all cases, otherwise A.bus s.l. (1) would be the same as A.bus s.str. (2). But when you know about all the species within the genus and know A.bus is also a pro parte synonym of A.fus, you can derive the unique concepts.
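A sketch of how the five numbered concepts in the drawing could be told apart by combining the included types with pro parte information. The key layout `(types, split_off_to)` is an assumption made for illustration, not an agreed rule.

```python
def concept_key(types, split_off_to=()):
    """Key a concept by its included types plus the taxa that received part of it.

    `split_off_to` lists taxa for which this name is a pro parte synonym.
    """
    return (frozenset(types), frozenset(split_off_to))

concepts = {
    1: concept_key({"bus"}),                                 # A.bus s.l.
    2: concept_key({"bus"}, split_off_to={"fus"}),           # A.bus s.str.
    3: concept_key({"fus"}),                                 # A.fus
    4: concept_key({"xus"}),                                 # A.xus
    5: concept_key({"bus", "xus"}, split_off_to={"fus"}),    # A.bus merged with A.xus
}

# Types alone would confuse (1) and (2); the combined key does not.
print(concepts[1][0] == concepts[2][0])
print(concepts[1] == concepts[2])
print(len(set(concepts.values())))
```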

mdoering avatar Sep 17 '18 12:09 mdoering

I think we need to be precise here about the words we use… (concept, step of concept

If I read the figure correctly, we have only 3 different taxonomic concepts here: A. xus, A. bus and A. fus.

  • 1960: taxon A. bus s.l. is described (1).
  • 1970: taxon A. xus (4) and taxon A. fus (3) are described. Some specimens of A. bus s.l. belong to A. fus. We have 2 new concepts, (3) and (4), plus 1 old concept (1), more restricted BUT still the same concept.
  • 1980: A. xus is synonymized with A. bus s.s.; A. fus remains. We have 2 concepts: (1), in still another step, and (3).

1, 2 and 5 are different stages/steps of the same taxonomic concept.

Type-bus (red dot) is the same in all stages/steps of the life of the same taxon A. bus (s.l., s.s., and including A. xus). So yes, a type does represent all the stages of the life of a taxon, but this is not what it is supposed to do: it just bears the name for this taxon. The type has nothing to do with how the concept is understood; it is just the name-bearing specimen for this concept. This specimen is only one among the many others that “make" the taxon; it provides the link between nomenclature and taxonomy.

In this example the taxonomic concept for A. bus remains the same; it just evolves in time according to its content (= extension), which is more or less restrictive (different steps/numbers of the same concept): successive stages/steps 1, 2 and 5. => the same concept may have different successive names according to its extension.

However, concepts are defined 1) by their content (extension = set of children-taxa/specimens to which the concept applies) AND 2) also by intension (list of its characters = its description), and not by the type specimen. If a taxon is transferred to another parent taxon with its set of children-taxa (a genus from one tribe to another tribe, a species from one genus to another genus), it changes by intension (its characters/description are changed). Accordingly, in that case it is no longer the same concept; we have 2 concepts, an old one and a new, different one, although it keeps the same name (except for brackets in the case of a species transferred to another genus)! => the same name (particularly in supraspecific taxa) may refer to different concepts.
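To contrast the two views, a small sketch: one key built from extension only, and one that also includes a stand-in for intension. Using the parent lineage as a proxy for the inherited characters is an assumption, as are both function names and the example lineages.

```python
def key_by_extension(members):
    """Concept identity from the included children only."""
    return frozenset(members)

def key_by_extension_and_intension(members, lineage):
    """Concept identity from children plus placement (proxy for inherited characters)."""
    return (frozenset(members), tuple(lineage))

genus_members = {"species a", "species b"}

# Same children, genus moved from one tribe to another.
before = key_by_extension_and_intension(genus_members, ["Family X", "Tribe 1"])
after = key_by_extension_and_intension(genus_members, ["Family X", "Tribe 2"])

print(key_by_extension(genus_members) == key_by_extension(set(genus_members)))
print(before == after)
```

Under the extension-only key the transfer is invisible; under the combined key it yields a new concept, which is the behaviour argued for here.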

This is why 1) defining taxa by their extension only remains insufficient (my issue at the Woods Hole meeting) and 2) speaking of a taxon without referring to its classification (e.g. sec. author) might introduce strong bias, if not errors, in any taxonomic database if we don’t take care of these very particular inferred links (my point/talk at the Xishuangbanna meeting).

ThierryBourgoin avatar Sep 17 '18 14:09 ThierryBourgoin

@ThierryBourgoin so you say all 163 Acacia species that have been moved to the genus Vachellia should be considered different taxa describing a different set of organisms? Identifications to Acacia aroma cannot be safely transferred to Vachellia aroma as their circumscription is different?

mdoering avatar Sep 17 '18 22:09 mdoering

@ThierryBourgoin can you explain what you have in mind when the concept is more restricted but still the same concept? That sentence to me contradicts itself. If some specimens/organisms are excluded it is clearly different.

mdoering avatar Sep 17 '18 22:09 mdoering

Let me try an example of what I have in mind:

The taxonomic concept of the giraffe (G. camelopardalis) has recently been disputed (and still is, as far as I know) and the species concept has been ‘restricted' to the “Northern giraffe”, while 3 other species were recognized (the reticulated, Southern and Masai giraffes)… I regard the initial taxonomic concept of what G. camelopardalis (s.l.) is as still being the same, but it has been restricted (s.s.) to the north African populations.

Let us say that new analyses conclude in the future that this is not the case for 2 of them, the Southern and the Masai taxa. These 2 separated species will then come back ‘inside’ the taxonomic concept of G. camelopardalis, which will be understood more widely than now but still less widely than originally. These are just successive steps in the circumscription of the same concept viewed by extension.

Now let us say that a new analysis by author NNN shows that Giraffa is not a ruminant (Ruminantiamorpha) and should be moved from Giraffidae to the whales in Balaenidae ;-) Then Giraffa would be characterized by its own characters of course (the ones that allow us to recognize the set of all its included subtaxa) but also by all the characters of Balaenidae and not those of Giraffidae. For me this new definition by intension (new list of characteristics of Giraffa, including those of Balaenidae) would make the taxonomic concept a totally different one for Giraffa sec. NNN.

I don't know if I can put it this way, but in other words I would say that changing the content of a taxon does not change it (as a taxonomic concept), but changing the characteristics that it shares with other taxa (which is what we do with taxonomic transfers) does. In your example Acacia and Vachellia each remain the same concept, with a more restrictive or wider understanding of their taxonomic concept respectively, but Vachellia aroma and Acacia aroma are two different taxonomic concepts.

ThierryBourgoin avatar Sep 18 '18 10:09 ThierryBourgoin

Thanks @ThierryBourgoin, for identification purposes it is important that we capture the different opinions over time. In the terminology I propose here this means the concept of which populations are in and which are out does change, even though the type remains. In your example of a hypothetical merge of the Southern and Masai species back into G. camelopardalis we would actually have 3 different concepts over time, all known under G. camelopardalis. Referring to all 3 of them as the same concept would not allow us to deal with identifications accurately.

Take a look at iNaturalist to see why that is important for handling (historical) identifications: https://www.inaturalist.org/pages/curator+guide#changes Actual changes they track (unfortunately both Acacia and Giraffe are outdated): https://www.inaturalist.org/taxon_changes

A good bird example for a split based on distribution ranges: https://www.inaturalist.org/taxon_changes/32924

mdoering avatar Sep 18 '18 11:09 mdoering

I do understand your point about intension. The classification should be significant for the characters that define the taxon. But in many cases these do not alter the unit of populations that make up the taxon. The important part is that as long as the populations which make up the taxon do not change, the taxonomic concept has not changed, even if the circumscription might now include more or fewer characters. The primary anchor point is the group of populations that form a stable unit, not how exactly we characterize them. From Wikipedia:

In biology, a taxon is a group of one or more populations of an organism or organisms seen by taxonomists to form a unit. It is not uncommon, however, for taxonomists to remain at odds over what belongs to a taxon and the criteria used for inclusion.

mdoering avatar Sep 18 '18 11:09 mdoering

As in practice it is difficult to assess whether a change in characters has an actual effect on the size of the included populations it probably makes sense in some cases to track these concepts in their minute details. But this leads us again to an explosion of concepts. Every identification key will define its own concept, every change in classification yet many more. For the purpose of dealing with identifications, whether observations in GBIF or specimens in collections, we would like to have more stable identifiers though than names, not less stable ones.

mdoering avatar Sep 18 '18 11:09 mdoering

Hi Markus,

I think we agree. The same taxon might undergo different changes in its concept (how it is understood), and tracking these changes is crucial. But not all these changes are equal.

  • Any new specimen added to a taxon changes its concept: it extends its sample, possibly widens its distribution, adds new associated biological data… It restricts, extends or makes more precise the taxonomic concept that supports the taxon. In fact, almost each time we handle a taxonomic concept (publication) we change it (= chresonymy, potential taxon). In practice we are not going to register/identify all these stages, but it would be ideal.
  • Because of nomenclature issues, you mainly refer to cases where type specimens are involved/concerned. I agree with you: we need to identify these concept changes separately. They track the evolving definition of the concept by extension.
  • But transfers of taxa within the classification are also important (if not even more so) from the pure taxonomy point of view (not nomenclatural) and we don’t track these at the moment. This is my point, which I think is important for a taxonomic database. These track the definition of the concept by intension. Extension and intension are equally important; both contribute to the concept definition for a full representation of a taxon in a database, I think.

;-), Th.

ThierryBourgoin avatar Sep 18 '18 12:09 ThierryBourgoin

Yes. All 3 are probably best dealt with as different identifiers if you need all of them. I am just not sure if we do have users that need all of them. For number two I am sure we have.

mdoering avatar Sep 18 '18 12:09 mdoering

I'm very happy to see this thread back in action and wish to contribute constructively. I need to spend a bit more time reviewing all of this to get back in this frame of mind but I have two immediate comments.

I do not believe that the addition of a new specimen to a taxon changes the concept. The concept is not the specimen. The link between the identifier and the specimen is only through the concept. This is very clear within the famous Triangle of Reference model. In taxonomy, concepts are ideas expressed as publications (sometimes poorly) and anchored with the type. Specimens conspecific with the type are instances of the concept, not new concepts. This is why heterotypy must be the means by which concepts are expressed. The giraffe example is almost identical to the graphic example from Woods Hole (which shows five distinct concepts).

I remain unsettled regarding the higher classification being a property of the concept. Paul Kirk and Jerry Cooper were very resolute on this matter in regard to homotypic synonymy where a taxon was transferred to a different genus. No circumscription change and hence no concept change. A genus transfer is just a smaller iteration than a transfer to a higher group.

If a giraffe is transferred from the ruminants to the whales, then I can see this being a major change in what the whale group is but has the giraffe changed? I can see where a single concept might be sorted into different categories by different parties without the concept itself having to be changed.

For example, when David Patterson inserts the Choanoflagellata as a parent for all metazoa in his Union classification, does he really create all new concepts for all the fulgorids?

DR

dremsen avatar Sep 18 '18 13:09 dremsen

Hi Dave. Yes I'm also happy to see all this back again... ;-)

  • Finding a specimen of Giraffa in South Africa would surely modify your concept of Giraffa as being more widely distributed. It will not change the name ('Signifier' of the concept) but it will change its content, the 'Signified'. Distribution is part of the attributes of the concept, of how we understand the taxon.

  • If a taxon is transferred to another parent taxon, its definition is changed even if its circumscription (content) is not changed. A concept cannot be defined in one way only; it is defined by extension and by intension at the same time. If Giraffa moves to the whales, it acquires all the characteristics proper to whales up to the first common ancestor of Giraffidae and whales (Mysticeti, Cetacea, Whippomorpha). The Giraffa concept is completely changed (by intension), having all the successive synapomorphies of these clades. The whale concept is also changed: by intension in incorporating some Giraffa autapomorphies, and by extension in incorporating Giraffa.

  • When Choanoflagellata is inserted as parent of Metazoa, the characteristics of Choanoflagellata become part of the Metazoa lineage and its children taxa, and therefore yes, of the fulgorids. Leaving Choanoflagellata as sister to Metazoa excludes these characteristics from the metazoan lineage. Of course in practice we don't document these changes, but from a formal and logical point of view they occur, and I suppose documenting them is necessary for an accurate representation of taxonomy and its management in biodiversity informatics.

In fact my point here is that I would like to be sure that we don't have to redo this exercise later because the schema we are using to represent taxonomic knowledge is not complete enough. It was not necessary 20 years ago to separate names from taxa...

;-) Th.

ThierryBourgoin avatar Sep 18 '18 15:09 ThierryBourgoin

Thierry, certainly I agree with this last sentiment and so wish to be very careful. We need an identifier system that is tractable and has practical value while at the same time being precise enough to have meaning. My perspective is mainly that of a user with a particular set of use cases, and of a developer examining and trying to model concepts as presented in monographs and faunas.

dremsen avatar Sep 18 '18 15:09 dremsen

If there is no use case I don't think we should implement it. Keep things simple. It is not bad to refactor things in a few years, but to create something which is not used in the first place is wrong.

The ever-changing identifiers in the CoL have been a huge problem for its uptake; we need something far more stable. And in my opinion (based on use cases from GBIF, collections, iNaturalist and others) we need something to hold on to a stable taxon regardless of its name. Such a taxonID paired with a nameID is very powerful and would be a serious game changer.

mdoering avatar Sep 18 '18 15:09 mdoering

I saw the update come in and wanted to check in. Where do we stand on taxon concept IDs? I've been giving them a lot of thought recently. I think there are use cases for them. I think they are tractable. I think we can accommodate Thierry's interest in supporting the classification as a component of them. But, referring to a 180918 comment of Thierry's, a separation of names from taxa, or more specifically, syntax from semantics, is a requirement.

dremsen avatar Oct 30 '19 17:10 dremsen