comb. nov. and provis. should parse
Hello,
I found that the following strings produce parsing errors, while I think they should not. The last one is probably the case with most impact, as a subspecies is ignored.
In any case, thank you for this wonderful work!
Caloplaca sol Orange sp. nov.
{
  "parsed": true,
  "quality": 4,
  "qualityWarnings": [
    {
      "quality": 4,
      "warning": "Name is approximate"
    }
  ],
  "verbatim": "Caloplaca sol Orange sp. nov.",
  "normalized": "Caloplaca sol Orange",
  "canonical": {
    "stemmed": "Caloplaca sol",
    "simple": "Caloplaca sol",
    "full": "Caloplaca sol"
  },
  "cardinality": 0,
  "authorship": {
    "verbatim": "Orange",
    "normalized": "Orange",
    "authors": [
      "Orange"
    ]
  },
  "surrogate": "APPROXIMATION",
  "id": "a840ca49-98aa-5dcb-b36f-5c0998bb2e40",
  "parserVersion": "v1.11.2"
}
Caloplaca calcitrapa Nav.-Ros., Gaya et Cl. Roux comb. nov.
{
  "parsed": true,
  "quality": 4,
  "qualityWarnings": [
    {
      "quality": 4,
      "warning": "Unparsed tail"
    }
  ],
  "verbatim": "Caloplaca calcitrapa Nav.-Ros., Gaya et Cl. Roux comb. nov.",
  "normalized": "Caloplaca calcitrapa Nav.-Ros., Gaya \u0026 Cl. Roux",
  "canonical": {
    "stemmed": "Caloplaca calcitrap",
    "simple": "Caloplaca calcitrapa",
    "full": "Caloplaca calcitrapa"
  },
  "cardinality": 2,
  "rank": "sp.",
  "authorship": {
    "verbatim": "Nav.-Ros., Gaya et Cl. Roux",
    "normalized": "Nav.-Ros., Gaya \u0026 Cl. Roux",
    "authors": [
      "Nav.-Ros.",
      "Gaya",
      "Cl. Roux"
    ]
  },
  "tail": " comb. nov.",
  "id": "f5c35bcc-62e3-5b59-8557-5738a51494a1",
  "parserVersion": "v1.11.2"
}
Caloplaca xerothermica (Vondrák, Arup et I. V. Frolov) Cl. Roux comb. nov. provis. subsp. xerothermica
{
  "parsed": true,
  "quality": 4,
  "qualityWarnings": [
    {
      "quality": 4,
      "warning": "Unparsed tail"
    }
  ],
  "verbatim": "Caloplaca xerothermica (Vondrák, Arup et I. V. Frolov) Cl. Roux comb. nov. provis. subsp. xerothermica",
  "normalized": "Caloplaca xerothermica (Vondrák, Arup \u0026 I. V. Frolov) Cl. Roux",
  "canonical": {
    "stemmed": "Caloplaca xerothermic",
    "simple": "Caloplaca xerothermica",
    "full": "Caloplaca xerothermica"
  },
  "cardinality": 2,
  "rank": "sp.",
  "authorship": {
    "verbatim": "(Vondrák, Arup et I. V. Frolov) Cl. Roux",
    "normalized": "(Vondrák, Arup \u0026 I. V. Frolov) Cl. Roux",
    "authors": [
      "Vondrák",
      "Arup",
      "I. V. Frolov",
      "Cl. Roux"
    ]
  },
  "tail": " comb. nov. provis. subsp. xerothermica",
  "id": "64ada959-d61d-5c33-a28c-1240259f8e8d",
  "parserVersion": "v1.11.2"
}
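For anyone who wants to spot such cases in bulk, here is a minimal sketch. It assumes each result is available as a JSON string shaped like the outputs above. The helper name unparsed_tail is hypothetical; only the "tail" field is taken from the output shown in this report.

```python
import json


def unparsed_tail(result_json):
    """Return the stripped unparsed tail of a parser result, or None if absent."""
    result = json.loads(result_json)
    # The "tail" key only appears when part of the input was left unparsed.
    tail = result.get("tail", "").strip()
    return tail or None


# Abbreviated copy of the third result above (most fields omitted for brevity).
sample = (
    '{"parsed": true, "quality": 4, '
    '"canonical": {"simple": "Caloplaca xerothermica"}, '
    '"tail": " comb. nov. provis. subsp. xerothermica"}'
)

print(unparsed_tail(sample))  # comb. nov. provis. subsp. xerothermica
```

A tail that still contains a rank marker such as subsp. is a hint that an infraspecific epithet may have been dropped from the canonical form.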
Thank you, @aguilbau, for identifying the issue and providing feedback, and for your kind words.
Caloplaca sol Orange sp. nov. was incorrectly parsed. I need to prioritize the parsing of sp. nov. over sp. to correct this.
Caloplaca calcitrapa Nav.-Ros., Gaya et Cl. Roux comb. nov. was parsed as expected.
Caloplaca xerothermica (Vondrák, Arup et I. V. Frolov) Cl. Roux comb. nov. provis. subsp. xerothermica was also parsed as expected: the annotation sits inside the name, which makes it hard to salvage the subspecies part.
Annotations such as sp. nov., comb. nov., etc., are not considered part of the scientific name itself and are therefore placed in the 'unparsed tail'. However, in the first example, sp. nov. was incorrectly interpreted as sp., which needs to be fixed.
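The prioritization described above can be illustrated with a small sketch. This is not gnparser's grammar, only an illustration of the idea using regular expressions: the longer annotation pattern is tried first, and the bare sp. check runs only on what remains. The function name split_annotation and the short annotation list are assumptions made for this example.

```python
import re

# Assumed, deliberately short list of annotations: "sp. nov.", "comb. nov.", "provis."
ANNOTATION = re.compile(r"\b(?:sp|comb)\.\s*nov\.|\bprovis\.")
# A bare "sp." (not preceded by a word character, so "subsp." does not match).
BARE_SP = re.compile(r"\bsp\.")


def split_annotation(name):
    """Split a name string into (head, tail, is_surrogate).

    The annotation is matched first and moved to the tail, so "sp. nov."
    can no longer be mistaken for the approximate-name marker "sp.".
    """
    m = ANNOTATION.search(name)
    if m:
        head, tail = name[:m.start()].rstrip(), name[m.start():]
    else:
        head, tail = name, ""
    # Only the head is checked for a bare "sp.".
    return head, tail, bool(BARE_SP.search(head))


print(split_annotation("Caloplaca sol Orange sp. nov."))
print(split_annotation("Aus sp. bus"))
```

With this ordering, the first call reports sp. nov. as a tail and no surrogate, while Aus sp. bus is still flagged as a surrogate.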
Names like Aus sp. bus are considered surrogate names, indicating certainty about the genus (Aus) but not the species.
Caloplaca sol Orange sp. nov. refers to the species Caloplaca sol, described by Orange, with the annotation sp. nov. indicating it is a newly described species: https://www.cambridge.org/core/journals/lichenologist/article/caloplaca-sol-teloschistaceae-a-new-coastal-lichen-from-great-britain/D278DCB51C0E756F1F3CF42BA89D2F95
Thank you for your answer. Indeed, I found out that novel status indications are not strictly required by the code of nomenclature, although they are considered best practice.
For Caloplaca xerothermica (Vondrák, Arup et I. V. Frolov) Cl. Roux comb. nov. provis. subsp. xerothermica, do you think the author should have written Caloplaca xerothermica (Vondrák, Arup et I. V. Frolov) Cl. Roux subsp. xerothermica comb. nov. provis.?
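As a possible client-side workaround, the reordering asked about here could be done before the string reaches the parser. The sketch below assumes a small, hypothetical annotation list (sp. nov., comb. nov., provis.) and a hypothetical helper name, move_annotations_to_end. It is a preprocessing idea, not part of gnparser, and how the parser treats the rewritten string has not been verified here.

```python
import re

# Assumed annotation list, including any whitespace that precedes the annotation.
ANNOTATIONS = re.compile(r"\s*\b(?:(?:sp|comb)\.\s*nov\.|provis\.)")


def move_annotations_to_end(name):
    """Move embedded annotations to the end of the name string."""
    found = [m.group().strip() for m in ANNOTATIONS.finditer(name)]
    # Remove the annotations in place, then tidy up leftover whitespace.
    cleaned = ANNOTATIONS.sub("", name).strip()
    cleaned = re.sub(r"\s{2,}", " ", cleaned)
    return " ".join([cleaned] + found)


name = (
    "Caloplaca xerothermica (Vondrák, Arup et I. V. Frolov) Cl. Roux "
    "comb. nov. provis. subsp. xerothermica"
)
print(move_annotations_to_end(name))
```

For the example above this yields the second spelling in the question, with subsp. xerothermica directly after the authorship and the annotations at the end. A name without any of the listed annotations is returned unchanged.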
A name has a long-term life and is 'permanent', while annotations make sense only in a context. For example, every species name is sp. nov. at some point, but not later. Some annotations explain how a name was used from a taxonomic perspective, and as such have a 'taxonomic context'. Strictly speaking, authorship is also not part of the name, which is why authorship rendering varies so much. However, authorship is as 'permanent' as the name itself, and we try our best to parse it. For example, authorship is often very helpful for distinguishing homonyms.
I am going to close this issue, as annotations are currently out of scope for gnparser.