bdq
TG2-VALIDATION_COORDINATESTERRESTRIALMARINE_CONSISTENT
TestField | Value |
---|---|
GUID | b9c184ce-a859-410c-9d12-71a338200380 |
Label | VALIDATION_COORDINATESTERRESTRIALMARINE_CONSISTENT |
Description | Does the marine/non-marine biome of a taxon from the bdq:sourceAuthority match the biome at the location given by the coordinates? |
TestType | Validation |
Darwin Core Class | dcterms:Location |
Information Elements ActedUpon | dwc:decimalLatitude, dwc:decimalLongitude |
Information Elements Consulted | dwc:scientificName |
Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if either bdq:taxonIsMarine or bdq:geospatialLand are not available; INTERNAL_PREREQUISITES_NOT_MET if (1) dwc:scientificName is bdq:Empty or (2) the values of dwc:decimalLatitude or dwc:decimalLongitude are bdq:Empty or (3) if bdq:assumptionOnUnknownBiome is noassumption and the marine/nonmarine status of the taxon is not interpretable from bdq:taxonIsMarine; COMPLIANT if (1) the taxon marine/nonmarine status from bdq:taxonIsMarine matches the marine/nonmarine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:geospatialLand plus an exterior buffer given by bdq:spatialBufferInMeters or (2) if the marine/nonmarine status of the taxon is not interpretable from bdq:taxonIsMarine and bdq:assumptionOnUnknownBiome matches the marine/nonmarine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:geospatialLand plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT |
Data Quality Dimension | Consistency |
Term-Actions | COORDINATESTERRESTRIALMARINE_CONSISTENT |
Parameter(s) | bdq:taxonIsMarine, bdq:geospatialLand, bdq:spatialBufferInMeters, bdq:assumptionOnUnknownBiome |
Source Authority | bdq:taxonIsMarine default = "World Register of Marine Species (WoRMS)" {[https://www.marinespecies.org/]} {Web service [https://www.marinespecies.org/aphia.php?p=webservice]} |
bdq:geospatialLand default = "Union of NaturalEarth 10m-physical-vectors for Land and NaturalEarth Minor Islands" {[https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_land.zip], [https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_minor_islands.zip]} | |
bdq:spatialBufferInMeters default = "3000" | |
bdq:assumptionOnUnknownBiome default = "noassumption" | |
Specification Last Updated | 2024-08-30 |
Examples | [dwc:decimalLatitude="-41.0525925872862", dwc:decimalLongitude="-71.5310546742521", dwc:scientificName="Aegla neuquensis": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="The species is freshwater aquatic and the coordinates fall in a lake and thus COMPLIANT"] |
[dwc:decimalLatitude="20.0", dwc:decimalLongitude="-30.0", dwc:scientificName="Viviparus contectus (Millet, 1813)": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:scientificName is non-marine according to bdq:taxonIsMarine but coordinates are marine"] | |
Source | ALA, OBIS |
References |
|
Example Implementations (Mechanisms) | |
Link to Specification Source Code | |
Notes | dwc:coordinatePrecision and dwc:coordinateUncertaintyInMeters (if present) imply a potential displacement of the provided coordinates. These two terms can be considered spatial buffers. Likewise, country polygons cannot be 100% accurate at all scales (Dooley 2005), so a spatial buffer of the country boundaries is justified. Taking the spatial buffers into account does, however, greatly complicate both the logic and the implementation of such tests. The same applies to potential conversion of the Spatial Reference System (SRS) of dwc:decimalLatitude and dwc:decimalLongitude to the SRS used in the bdq:sourceAuthority. Note that in the current implementation, tests treat "brackish" in WoRMS as both marine and terrestrial. Note that both bdq:taxonIsMarine and bdq:geospatialLand are bdq:sourceAuthorities, but as they form two parameters, distinct names are used for them. |
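The Expected Response above can be read as a short decision procedure. The following is a minimal Python sketch of that reading, not the reference implementation. The function name, the `(status, result)` return shape, and the representation of biomes as sets of the strings "marine" and "nonmarine" are assumptions made for illustration, as is the idea that bdq:assumptionOnUnknownBiome takes the values "marine" or "nonmarine" when it is not "noassumption". The lookups against bdq:taxonIsMarine and bdq:geospatialLand (including the buffer) are assumed to have been done by the caller and passed in as `taxon_biomes` and `location_biomes`.

```python
from typing import Optional, Set, Tuple

MARINE = "marine"
NONMARINE = "nonmarine"


def is_empty(value: Optional[str]) -> bool:
    """Treat None and whitespace-only strings as bdq:Empty."""
    return value is None or str(value).strip() == ""


def coordinates_terrestrialmarine_consistent(
    scientific_name: Optional[str],
    decimal_latitude: Optional[str],
    decimal_longitude: Optional[str],
    taxon_biomes: Optional[Set[str]],
    location_biomes: Set[str],
    authorities_available: bool = True,
    assumption_on_unknown_biome: str = "noassumption",
) -> Tuple[str, Optional[str]]:
    """Return (Response.status, Response.result) for one record.

    taxon_biomes: biomes the taxon may occupy, or None/empty when the
        status is not interpretable from the taxon source authority.
    location_biomes: biomes the coordinates are consistent with once the
        buffer has been applied (both values are possible near a coast).
    """
    if not authorities_available:
        return ("EXTERNAL_PREREQUISITES_NOT_MET", None)
    if (is_empty(scientific_name) or is_empty(decimal_latitude)
            or is_empty(decimal_longitude)):
        return ("INTERNAL_PREREQUISITES_NOT_MET", None)
    if not taxon_biomes:
        if assumption_on_unknown_biome == "noassumption":
            return ("INTERNAL_PREREQUISITES_NOT_MET", None)
        # Hypothetical convention: the parameter names the biome to assume.
        taxon_biomes = {assumption_on_unknown_biome}
    if taxon_biomes & location_biomes:
        return ("RUN_HAS_RESULT", "COMPLIANT")
    return ("RUN_HAS_RESULT", "NOT_COMPLIANT")
```

Under these assumptions the two examples in the table behave as described: a non-marine taxon at a non-marine location is COMPLIANT, and a non-marine taxon at a marine location is NOT_COMPLIANT. The sketch does not check that the coordinates are numeric or in range.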
Comment by Paula Zermoglio (@pzermoglio) migrated from spreadsheet: Should we consider whether we are talking about paleo records?
Comment by Paul Morris (@chicoreus) migrated from spreadsheet: terrestrial spatial layer isn't a Darwin Core term, needs to be listed in a different column. Difficult to work with nearshore environments, precision of the GIS layers may not be high enough and coastal boundaries and occurrence locations may need to be buffered as part of a test.
Comment by Paul Morris (@chicoreus) migrated from spreadsheet: OBIS codes taxa as marine, freshwater, brackish, or terrestrial. Definition of terrestrial unclear here: does it mean on land, or non-marine, and would freshwater taxa be expected to fall inside terrestrial polygons? Some taxa can be in different environments depending on where they are in their lifecycle; freshwater, brackish water, and marine phases are not unusual. Brackish water and nearshore environments tend to be problematic for detection of problematic coordinates, particularly without high resolution GIS layers.
Comment by Christian Gendreau (@cgendreau) migrated from spreadsheet: don't forget it also implies you "understood" the species, so it's extremely difficult to implement. I would definitely not include it in the core.
Comment by Paul Morris (@chicoreus) migrated from spreadsheet: Difficult to implement
This probably needs further discussion. There are two ways this can be done. Some earlier discussion seemed to indicate that it may be better to use the taxon to decide if it is Marine or not - i.e. go the OBIS way and use WORMS - and basically decide on a Taxon basis. Alternatively (and harder to implement) is to use similar techniques to #73 and use a GIS layer. If the latter, then we probably need to add the 3km buffer - but what about things in estuaries (where coastlines are particularly unreliable). I think I favour using the taxon to decide, using WORMS. Whatever we do it needs to be rewritten.
Also see comment by @iDigBioBot above
The test is certainly useful if it can flag species in a wrong location and marine vs non-marine is the most broad first cut.
This test requires (a) the habitat identification of the taxon, (b) the location using dwc:decimalLatitude and dwc:decimalLongitude, and (c) a spatial buffer. (a) could be iffy if it is based on WORMS or IRMNG.
Note that if you are using layers for the marine/terrestrial boundaries, the scale of the land/water interface in the EEZ layers on marineregions.org is a lot coarser than that of the GADM country boundaries (and hence their land/marine interface), so I would suggest the latter.
Is the EEZ relevant here? All we need to know is land vs water? Maybe. Australia has a category called 'External territories' that includes a bunch of islands like Cocos, Heard, McDonald, Norfolk and I guess from https://www.ga.gov.au/scientific-topics/marine/jurisdiction/maritime-boundary-definitions that these are part of the EEZ.
The GADM country layers should have the islands, so the EEZ is probably not as relevant and is at a much worse scale. I would just use GADM.
Looking at the Expected Response in this one - it is not clear if we are using geographic boundaries or relying on taxon IRMNG and the OBIS codes to determine isMarine. As written we seem to be having bets both ways.
As far as I am concerned, there are at least two bdq:sourceAuthority references here. The first is either WORMS or IRMNG and the second is GADM. A potential third may be EEZs.
I cannot see why we don't have an EXTERNAL_PREREQUISITES_NOT_MET
We again have the potential problem of spatial buffers. As per #73 (and others), buffering makes for serious complications. I would be happier to accept the false positive here if we skip the complexities (and put scenarios with buffers in Notes) as a VALIDATION, than I would for an AMENDMENT as in #73.
It depends on whether you go the spatial route (then GADM) or the taxon route (then IRMNG). I think we have to use one or the other and not both? The IRMNG I think includes the OBIS codes for marine, freshwater, brackish, or terrestrial; perhaps these could be used in some way. If you go the spatial route, then you have the problem of fuzzy coastlines (so it is not going to be accurate within 3km if using global layers and not a localised GIS), and the IRMNG may thus be more accurate. It may need to be something that is tested against some real data. If we go the taxon route, then we are at least consistent with OBIS.
OBIS uses the habitat values from WoRMS, not IRMNG. A taxon can have more than one value. Only those tagged with marine appear in the OBIS portal. OBIS does report on suspect terrestrial locations but the data still appears. Obviously if it is a seabird it can be inland, migrating or nesting. We have seen terrestrial birds fly south in the Southern Ocean (likely never to return), yet that is a valid observation. Many observations of marine animals are made from the coast, so they appear in the wrong spot. Saltwater crocs travel up rivers, so they appear inland even if tagged as marine. I think the WoRMS taxonomic editors are more likely to tag the taxa correctly as marine than an approach that uses observation records to define 'marine'.
Thanks @davewatts3 - very valuable contribution.
An addition to @davewatts3 on the WoRMS/Aphia ID values. The Aphia taxonomic register is increasingly used for non-marine taxa as well, in the framework of the "Lifewatch Species Information Backbone" (http://www.lifewatch.be/en/lifewatch-species-information-backbone). Depending on the context, more or less information can be found using filters relevant for that context. As @iDigBioBot mentioned, in Aphia the Environment distinguishes 'Marine', 'Brackish', 'Fresh' or 'Terrestrial' with an indication of yes, no or unknown. So catadromous fish like eels can be labelled with yes for 'Marine', 'Brackish' and 'Fresh'.
Currently the Marine label for marine organisms is probably the best filled out value, with Brackish, Fresh and Terrestrial catching up.
For the Antarctic we currently have two registers that build on Aphia: the Register of Antarctic Marine Species, which only displays marine/brackish species, and the Register of Antarctic Species, which displays all species. In the backend they both run on Aphia.
From an Antarctic perspective the GADM layer is quite coarse. So even when buffered it would not be very reliable.
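The yes/no/unknown environment flags described in this comment could be reduced to the marine/non-marine status the test needs roughly as follows. This is a sketch under stated assumptions: the record is taken to be a plain dictionary with keys `isMarine`, `isBrackish`, `isFreshwater` and `isTerrestrial` holding 1 (yes), 0 (no) or None (unknown), and the helper name is hypothetical. Brackish is treated as both marine and non-marine, following the Notes in the specification.

```python
from typing import Dict, Optional, Set

MARINE = "marine"
NONMARINE = "nonmarine"


def biomes_from_environment_flags(
    record: Dict[str, Optional[int]]
) -> Optional[Set[str]]:
    """Map yes/no/unknown environment flags to a set of biomes.

    Returns None when no flag is set to yes, meaning the marine/non-marine
    status is not interpretable from the record.
    """
    biomes: Set[str] = set()
    if record.get("isMarine") == 1:
        biomes.add(MARINE)
    if record.get("isFreshwater") == 1 or record.get("isTerrestrial") == 1:
        biomes.add(NONMARINE)
    if record.get("isBrackish") == 1:
        # Per the Notes: brackish counts as both marine and terrestrial.
        biomes.update({MARINE, NONMARINE})
    return biomes or None


# A catadromous eel flagged yes for Marine, Brackish and Fresh.
eel = {"isMarine": 1, "isBrackish": 1, "isFreshwater": 1, "isTerrestrial": 0}
eel_biomes = biomes_from_environment_flags(eel)
```

With this mapping an eel is consistent with coordinates on either side of the coastline, while a record whose flags are all unknown yields None and falls through to the bdq:assumptionOnUnknownBiome handling.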
COMPLIANT if a terrestrial taxon represented by the Source Authority for marine and terrestrial biomes has geographic coordinates that fall within terrestrial boundaries plus an exterior buffer given by bdq:spatialBufferMeters or a marine taxon represented by the source Authority for marine and terrestrial biomes has geographic coordinates that fall within marine boundaries plus an exterior buffer given by bdq:spatialBufferMeters; otherwise NOT_COMPLIANT
Remove references to bdq:isMarine, references an external Kurator actor that we aren't using.
PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terrestrial or marine status of the taxon is not provided or is ambiguous from the bdq:sourceAuthority or the values of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY; COMPLIANT if a taxon coded as terrestrial by the bdq:sourceAuthority has geographic coordinates that fall within terrestrial boundaries plus an exterior buffer given by bdq:spatialBufferMeters or a marine taxon according to the bdq:sourceAuthority has geographic coordinates that fall within marine boundaries plus an exterior buffer given by bdq:spatialBufferMeters; otherwise NOT_COMPLIANT
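One way to read "boundaries plus an exterior buffer" is that a point within the buffer distance of the coastline, on either side, is accepted as consistent with both biomes. The sketch below illustrates only that reading. It assumes a GIS lookup has already produced two facts about the point (whether it falls inside a land polygon, and its distance in metres to the nearest land/water boundary). The function name and both inputs are hypothetical, and the default of 3000 m mirrors the default given for bdq:spatialBufferInMeters.

```python
from typing import Set

MARINE = "marine"
NONMARINE = "nonmarine"


def location_biomes(
    on_land: bool,
    distance_to_coast_m: float,
    spatial_buffer_in_meters: float = 3000.0,
) -> Set[str]:
    """Biomes a coordinate is consistent with, allowing for a coastal buffer.

    on_land: True if the point falls inside the land polygons.
    distance_to_coast_m: distance from the point to the nearest
        land/water boundary, in metres (assumed precomputed).
    """
    if distance_to_coast_m <= spatial_buffer_in_meters:
        # Inside the buffer: too close to the coast to rule out either biome.
        return {MARINE, NONMARINE}
    return {NONMARINE} if on_land else {MARINE}
```

This keeps the buffer logic out of the validation itself: the validation only has to test whether the taxon's biomes and the location's biomes overlap.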
The specification describes assessing using the marine/non-marine status of the taxon, but no taxon term is included as an information element.
Oops. I agree @chicoreus. Can we get away with dwc:scientificName? Admitted, amendments may affect the taxon identity.
Discussed in call 2022 Feb 27, dwc:scientificName added as an information element, dwc:taxonID not added, as if it isn't an aphia id, it likely complicates the lookup of marine/non-marine status, and we have an amendment to set scientificName from taxonID. This leaves potential issues with homonyms, but for flagging potentially problematic data, these would represent edge cases.
This test is different from others in having more than one source authority (one for an is marine lookup, the other for land/marine geospatial data). I've represented this here as bdq:sourceAuthority[taxonismarine, geospatialland], suggesting a map of two source authorities with distinguishable targets, rather than proposing a second source authority term, e.g. bdq:sourceAuthority2.
@Antonarctica I've suggested a default geospatial data set of a union of Natural Earth land with Natural Earth minor islands. Will this work better for Antarctic uses than GADM?
@tucotuco @ArthurChapman please check that the geospatial language makes sense.
For clarity, proposing change of the specification from:
EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority service was not available; INTERNAL_PREREQUISITES_NOT_MET if the non-marine/marine status of the taxon is not provided or is not interpretable from bdq:sourceAuthority or the values of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY; COMPLIANT if a taxon coded as non-marine by the bdq:sourceAuthority has geographic coordinates that fall within non-marine boundaries plus an exterior buffer given by bdq:spatialBufferInMeters or a marine taxon according to the bdq:sourceAuthority has geographic coordinates that fall within marine boundaries plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT
To:
EXTERNAL_PREREQUISITES_NOT_MET if a bdq:sourceAuthority service was not available; INTERNAL_PREREQUISITES_NOT_MET if the non-marine/marine status of the taxon is not provided or is not interpretable to a set of identical values for all taxon matches in bdq:sourceAuthority[taxonomyismarine] or the values of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY; COMPLIANT if a taxon coded as non-marine by the bdq:sourceAuthority[taxonomyismarine] has geographic coordinates that fall within non-marine boundaries given by bdq:sourceAuthority[geospatialland] plus an exterior buffer given by bdq:spatialBufferInMeters or a marine taxon according to the bdq:sourceAuthority has geographic coordinates that fall within marine boundaries given by bdq:sourceAuthority[geospatialland] plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT
Adding to the notes that "is not interpretable to a set of identical values for all taxon matches" is intended to mean that if homonyms are matched by a scientificName, and they all have the same marine/brackish/non-marine status, then the test can continue, but if they differ, then the test must return prerequisites not met (e.g. if Aus bus matches a land plant and a land snail, the test can continue, but if it matches a land plant and a marine worm, then the prerequisites aren't met). If there are multiple name matches, but they all have the same status, then it doesn't matter, but if they are different in terrestrial/marine status, then there isn't enough information to continue.
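The homonym rule described above can be sketched as a small helper. This is an illustration under assumptions, not the agreed implementation: each name match is assumed to have already been reduced to a set of biomes ("marine", "nonmarine"), and the function name is hypothetical.

```python
from typing import Iterable, Optional, Set

MARINE = "marine"
NONMARINE = "nonmarine"


def status_across_matches(
    matches: Iterable[Set[str]]
) -> Optional[Set[str]]:
    """Return the shared biome set if every name match agrees, else None.

    None signals that the status is not interpretable, either because
    there were no matches or because homonyms disagree.
    """
    distinct = {frozenset(biomes) for biomes in matches}
    if len(distinct) != 1:
        return None
    return set(next(iter(distinct)))


# "Aus bus" matching a land plant and a land snail: the test can continue.
agreeing = status_across_matches([{NONMARINE}, {NONMARINE}])
# "Aus bus" matching a land plant and a marine worm: prerequisites not met.
conflicting = status_across_matches([{NONMARINE}, {MARINE}])
```

Comparing whole sets rather than single labels means two homonyms that are both marine-and-brackish still count as agreeing, which matches the "identical values for all taxon matches" wording.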
Looks good. I wonder if we should add bdq before the taxonomyismarine etc. and use camelCase, e.g. "bdq:sourceAuthority[bdq:taxonomyIsMarine]" and "bdq:sourceAuthority[bdq:geospatialLand]".
Alternatively we could use bdq:taxonomyIsMarine and bdq:geospatialLand (as you have used bdq:spatialBufferInMeters) and define them in the Glossary. If we used this, we could change the first part to read
EXTERNAL_PREREQUISITES_NOT_MET if a bdq:sourceAuthority service (bdq:taxonomyIsMarine, bdq:geospatialLand, or bdq:spatialBufferInMeters) was not available; ....
Thanks @chicoreus and @ArthurChapman. I agree with you @chicoreus in regards the syntax options bdq:sourceAuthority[taxonomyismarine] and bdq:sourceAuthority[geospatialland]. I think this is simpler @ArthurChapman than repeating bdq - which could get confusing.
I think the new Expected Response needs a double-barrel on the EXTERNAL however, and maybe explicit reference to dwc:scientificName, and a simplification of the geographic bit?
EXTERNAL_PREREQUISITES_NOT_MET if a bdq:sourceAuthority[taxonomyismarine] or bdq:sourceAuthority[geospatialland] service was not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName was EMPTY or the non-marine/marine status of the taxon is not interpretable from bdq:sourceAuthority[taxonomyismarine] or the values of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY; COMPLIANT the taxon marine/non-marine status from bdq:sourceAuthority[taxonomyismarine] matches the status marine/non-marine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:sourceAuthority[geospatialland] plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT
I like the version from @Tasilee except for the word "service". Can we strike that everywhere it occurs to avoid confusion about resources made locally from the authorities?
Seems ok to me @tucotuco. I'm happy to make the changes once we have comments from @chicoreus and @ArthurChapman.
And just noticed a typo in my ER above with status (and poor English) and made consistent "marine/non-marine"...try this...
EXTERNAL_PREREQUISITES_NOT_MET if a bdq:sourceAuthority[taxonomyismarine] or bdq:sourceAuthority[geospatialland] was not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName was EMPTY or the marine/non-marine status of the taxon is not interpretable from bdq:sourceAuthority[taxonomyismarine] or the values of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY; COMPLIANT the taxon marine/non-marine status from bdq:sourceAuthority[taxonomyismarine] matches the marine/non-marine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:sourceAuthority[geospatialland] plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT
There is some syntax error in the above - suggest
EXTERNAL_PREREQUISITES_NOT_MET if either bdq:sourceAuthority[taxonomyismarine] or bdq:sourceAuthority[geospatialland] are not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName was EMPTY or the non-marine/marine status of the taxon is not interpretable from bdq:sourceAuthority[taxonomyismarine] or the values of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY; COMPLIANT the taxon marine/non-marine status from bdq:sourceAuthority[taxonomyismarine] matches the status marine/non-marine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:sourceAuthority[geospatialland] plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT
Still not quite right. Try this...
EXTERNAL_PREREQUISITES_NOT_MET if either bdq:sourceAuthority[taxonomyismarine] or bdq:sourceAuthority[geospatialland] are not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName was EMPTY or the marine/non-marine status of the taxon is not interpretable from bdq:sourceAuthority[taxonomyismarine] or the values of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY; COMPLIANT if the taxon marine/non-marine status from bdq:sourceAuthority[taxonomyismarine] matches the marine/non-marine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:sourceAuthority[geospatialland] plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT
Just one small error - add "if" after COMPLIANT
ah. Done.