
Code Table Request - Create locality attribute "waterbody", with values that are aquatic names from IHO (WITHOUT containing associated shapes)

Open genevieve-anderegg opened this issue 1 year ago • 80 comments

Instructions

This is a template to facilitate communication with the Arctos Code Table Committee. Submit a separate request for each relevant value. This form is appropriate for exploring how data may best be stored, for adding vocabulary, or for updating existing definitions.

Reviewing documentation before proceeding will result in a more enjoyable experience.


Initial Request

Goal

Describe what you're trying to accomplish. This is the only necessary step to start this process. The Committee is available to assist with all other steps. Please clearly indicate any uncertainty or desired guidance if you proceed beyond this step.

As discussed today in the Arctos Geography Committee: Collections with aquatic specimens need a way to associate both of the following in the catalog record: 1) the administrative area associated with the specimen, for permitting purposes, and 2) the non-administrative name of the water body the specimen was collected in. Collections have aquatic specimens with varying levels of detail in their collection data (a lot to very little), which means specimens with sparse collection data (e.g. "Pacific Ocean") are often not georeferenced. However, it would be ideal to be able to curatorially apply the names of water bodies to catalog record specimen collection events regardless of whether they are georeferenced. Currently, the ability to ingest new shape files into Arctos, and the associated cleanup/association of various shape files in different spatial layers (which will be needed for specimens collected on "edges", like "off the coast of Florida, in the Atlantic Ocean"), is limited by manpower, processing power, and funding. Therefore, the Geography Committee proposes that the names of water bodies should be linkable to catalog records through a controlled Locality Attribute table (with NO associated shape files attached to water body names). This would let collections more easily attach a controlled water body name to a record (which will assist with searching within Arctos, allowing searches for records both with and without coordinates), while also bypassing the current issues with spatial data (https://github.com/ArctosDB/arctos/issues/6521). If/when the issues with spatial data are overcome, these data could also assist with linking shape files back to records via this locality attribute. Additionally, this attribute could be mapped to the waterbody field in aggregators such as GBIF to assist with searching in those services.

Geography members, please edit/add/correct anything stated here!! @dustymc @happiah-madson @sharpphyl @ekrimmel @falco-rk

Context

Describe why this new value is necessary and existing values are not.

This new locality attribute will allow collections to link the name of a water body to a catalog record regardless of the quality of the specimen collection data

Table

Code Tables are http://arctos.database.museum/info/ctDocumentation.cfm. Link to the specific table or value. This may involve multiple tables and will control datatype for Attributes. OtherID requests require BaseURL (and example) or explanation. Please ask for assistance if unsure.

https://arctos.database.museum/info/ctDocumentation.cfm?table=ctlocality_attribute_type#

Proposed Value

Locality attribute of "waterbody", with pick list of controlled names that are already in Arctos, currently associated with IHO and Marine Regions shape files, but WITHOUT linking to these shape files. Dusty has the names of waterbodies in Arctos from these services already, but this issue proposes just using the names of these shapes as an attribute (and NOT the shapes).

Proposed Definition

~~See above~~

~~Any significant accumulation of water on the surface of Earth or another planet. The term most often refers to oceans, seas, and lakes, but it may include smaller pools of water such as ponds. A body of water does not have to be still or contained; rivers, streams, and canals are also considered bodies of water.~~

  • from @Jegelewicz in https://github.com/ArctosDB/arctos/issues/7374#issuecomment-2163250149
  • nevermind see https://github.com/ArctosDB/arctos/issues/7374#issuecomment-2166251993

~~A body of water.~~

  • from @Jegelewicz https://github.com/ArctosDB/arctos/issues/7374#issuecomment-2166901785, sentenced.
  • nevermind I hate this too

A body of water. Acceptable values are listed in https://www.marineregions.org/ with "PlaceType" in (IHO Sea Area). Documentation URL must be a marineregions.org MRGID. Water Body Type must be a marineregions.org PlaceType. For example, Water Body=Arctic Ocean; Water Body Type=IHO Sea Area; Documentation URL=http://marineregions.org/mrgid/1906.

  • from @dustymc in https://github.com/ArctosDB/arctos/issues/7374#issuecomment-2168430672
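
The rules in the definition above could be checked mechanically. The following is a minimal sketch, not Arctos code: the class, function, and constant names are hypothetical, and the URL pattern is an assumption inferred only from the single example URL given in the definition.

```python
import re
from dataclasses import dataclass

# Assumption: MRGID documentation URLs look like the example in the definition
# (http://marineregions.org/mrgid/1906); https and a "www." prefix are tolerated.
MRGID_URL = re.compile(r"^https?://(www\.)?marineregions\.org/mrgid/\d+$")

# Assumption: only the PlaceType named in the definition is accepted.
ALLOWED_PLACE_TYPES = {"IHO Sea Area"}


@dataclass
class WaterBodyAttribute:
    """Hypothetical container for one waterbody attribute assertion."""
    water_body: str
    water_body_type: str
    documentation_url: str


def validation_errors(attr: WaterBodyAttribute) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    errors = []
    if not attr.water_body.strip():
        errors.append("Water Body is empty")
    if attr.water_body_type not in ALLOWED_PLACE_TYPES:
        errors.append(
            f"Water Body Type {attr.water_body_type!r} is not an accepted PlaceType"
        )
    if not MRGID_URL.match(attr.documentation_url):
        errors.append("Documentation URL is not a marineregions.org MRGID")
    return errors


# The example from the definition above.
example = WaterBodyAttribute(
    water_body="Arctic Ocean",
    water_body_type="IHO Sea Area",
    documentation_url="http://marineregions.org/mrgid/1906",
)
print(validation_errors(example))
```

Note that this only checks the shape of the values. It does not confirm that the MRGID actually resolves to the named water body, which would require a lookup against marineregions.org.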

Attribute Extras

Attribute data type

If the request is for an attribute, what values will be allowed? free-text, categorical, or number+units depending upon the attribute (TBA)

Pick list

Attribute controlled values

If the values are categorical (to be controlled by a code table), add a link to the appropriate code table. If a new table or set of values is needed, please elaborate.

Attribute units

If numerical values should be accompanied by units, provide a link to the appropriate units table.

Part preservation attribute effect on "tissueness"

If a new part preservation is requested, please add the effect it would have on "tissueness": No Influence, Allows, or Denies

Priority

Please describe the urgency and/or choose a priority-label to the right. You should expect a response within two working days, and may utilize Arctos Contacts if you feel response is lacking.

High

Example Data

Requests with clarifying sample data are generally much easier to understand and prioritize. Please attach or link to any representative data, in any form or format, which might help clarify the request.

Available for Public View

Most data are by default publicly available. Describe any necessary access restrictions.

Helpful Actions

  • [x] Add the issue to the Code Table Management Project.

  • [x] Please reach out to anyone who might be affected by this change. Leave a comment or add this to the Committee agenda if you believe more focused conversation is necessary.

@ArctosDB/arctos-code-table-administrators

Approval

All of the following must be checked before this may proceed.

The How-To Document should be followed. Pay particular attention to terminology (with emphasis on consistency) and documentation (with emphasis on functionality). No person should act in multiple roles; the submitter cannot also serve as a Code Table Administrator, for example.

  • [x] Code Table Administrator[1] - check and initial, comment, or thumbs-up to indicate that the request complies with the how-to documentation and has your approval
  • [x] Code Table Administrator[2] - check and initial, comment, or thumbs-up to indicate that the request complies with the how-to documentation and has your approval
  • [ ] DBA - The request is functionally acceptable. The term is not a functional duplicate, and is compatible with existing data and code.
  • [ ] DBA - Appropriate code or handlers are in place as necessary. (ID_References, Media Relationships, Encumbrances, etc. require particular attention)

Rejection

If you believe this request should not proceed, explain why here. Suggest any changes that would make the change acceptable, alternate (usually existing) paths to the same goals, etc.

  1. Can a suitable solution be found here? If not, proceed to (2)
  2. Can a suitable solution be found by Code Table Committee discussion? If not, proceed to (3)
  3. Take the discussion to a monthly Arctos Working Group meeting for final resolution.

Implementation

Once all of the Approval Checklist is appropriately checked and there are no Rejection comments, or in special circumstances by decree of the Arctos Working Group, the change may be made.

  • [ ] Review everything one last time. Ensure the How-To has been followed. Ensure all checks have been made by appropriate personnel.

  • [ ] Add or revise the code table term/definition as described above. Ensure the URL of this Issue is included in the definition. URLs should be included as text, separated by spaced pipes. Do not include HTML in definitions.

Close this Issue.

DO NOT modify Arctos Authorities in any way before all points in this Issue have been fully addressed; data loss may result.

Special Exemptions

In very specific cases and by prior approval of The Committee, the approval process may be skipped, and implementation requirements may be slightly altered. Please note here if you are proceeding under one of these use cases.

  1. Adding an existing term to additional collection types may proceed immediately and without discussion, but doing so may also subject users to future cleanup efforts. If time allows, please review the term and definition as part of this step.
  2. The Committee may grant special access on particular tables to particular users. This should be exercised with great caution only after several smooth test cases, and generally limited to "taxonomy-like" data such as International Commission on Stratigraphy terminology.

genevieve-anderegg avatar Feb 05 '24 21:02 genevieve-anderegg

Here's some data to get this started. I was hoping to somehow use higher geog as a start to documentation, but it doesn't seem very helpful and I think that's just going to make messes. Assuming this proceeds (I don't see any technical problems with it), it's probably ideal to open an issue for each of these, BUT that would be overwhelming and these aren't exactly functional terms, so maybe it makes sense to create the code table from a spreadsheet. If that works, these (or whatever of them you want to create as ctwaterbody) would need a definition added.

temp_maybewaterbody.csv

@ArctosDB/arctos-code-table-administrators does that sound right/reasonable?
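
A rough sketch of what "create the code table from a spreadsheet" could involve: de-duplicate the terms and list the ones still lacking a definition. The column names (`waterbody`, `definition`) and the sample rows are hypothetical; the actual layout of temp_maybewaterbody.csv is not shown in this thread.

```python
import csv
import io

# Hypothetical sample standing in for the attached CSV; real columns may differ.
SAMPLE_CSV = """waterbody,definition
Arctic Ocean,
Adriatic Sea,
Arctic Ocean,
"""


def load_terms(handle):
    """Return (unique terms in first-seen order, terms still lacking a definition)."""
    terms = {}
    for row in csv.DictReader(handle):
        name = row["waterbody"].strip()
        if not name:
            continue  # skip blank rows
        definition = (row.get("definition") or "").strip()
        # Keep the first non-empty definition seen for each term.
        if name not in terms or (not terms[name] and definition):
            terms[name] = definition
    missing = [name for name, definition in terms.items() if not definition]
    return list(terms), missing


terms, missing = load_terms(io.StringIO(SAMPLE_CSV))
print("terms:", terms)
print("need definition:", missing)
```

To run this against a real file, the `io.StringIO(...)` argument would be replaced by an open file handle, with the column names adjusted to match.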


Mostly unrelated, I also tried to pull geography from some DMNS coordinates. That's no problem to do, but it's slow and I think probably dangerously expensive - which still seems like some sort of 'database is broken' problem to me, but it's not one that I can see or fix. Anyway, here's the few records I processed, maybe it'll be useful....



 locality_id | geogs
-------------+-------
    10017175 | Mexico|North America|Pacific Ocean, Gulf of California|Mexico, Baja California
     1151427 | NONE
    10000187 | NONE
    10001001 | NONE
    10007230 | NONE
       80277 | NONE
    10001465 | NONE
    10075367 | North America|United States|United States, Florida|United States, Florida, Miami-Dade County
    10097902 | Solomon Islands, Central|Solomon Islands, Malaita|South Pacific Ocean|Solomon Islands, Isabel|Solomon Islands|Pacific Ocean, Solomon Sea
    10099964 | South Atlantic Ocean|Brazil, Bahia|Brazil|South America
    10097462 | South Island|South Pacific Ocean|New Zealand, Canterbury|New Zealand
    10000673 | Tanzania, Tanga|Africa|Somalia, Shabeellaha Hoose|Somalia, Bay|Somalia, Gedo|Somalia, Jubbada Dhexe|Somalia, Jubbada Hoose|Uganda|Tanzania|Ethiopia|Kenya, Isiolo|Uganda, Kumi|Uganda, Pallisa|Uganda, Lake Victoria|Uganda, Lira|Uganda, Luwero|Uganda, Masaka|Uganda, Mayuge|Uganda, Mbale|Uganda, Moroto|Uganda, Mpigi|Uganda, Mukono|Uganda, Nakapiripirit|Uganda, Nakasongola|Uganda, Sironko|Uganda, Soroti|Uganda, Wakiso|Uganda, Pader|Uganda, Apac|Uganda, Bugiri|Uganda, Busia|Uganda, Kitgum|Uganda, Iganga|Uganda, Jinja|Uganda, Kaberamaido|Uganda, Kalangala|Uganda, Kamuli|Uganda, Kapchorwa|Uganda, Katakwi|Uganda, Kayunga|Uganda, Kotido|Kenya, Baringo|Somalia|Uganda, Kampala|Uganda, Tororo|Tanzania, Kagera|Kenya, Laikipia|Indian Ocean|Tanzania, Manyara|Kenya, Kwale|Kenya, Samburu|Kenya, Kiambu|Kenya, Kericho|Kenya, Meru|Kenya, Narok|Kenya, Nairobi|Ethiopia, Oromia|Kenya, Bomet|Kenya, Bungoma|Kenya, Busia|Kenya, Elgeyo-Marakwet|Kenya, Embu|Kenya, Garissa|Kenya, Tana River|Tanzania, Simiyu|Kenya, Kitui|Kenya, Lamu|Kenya, Kilifi|Kenya, Mombasa|Tanzania, Singida|Kenya|Tanzania, Mwanza|Tanzania, Kilimanjaro|Ethiopia, Somali|Ethiopia, Southern Nations, Nationalities|Kenya, Nakuru|Kenya, Kisumu|Kenya, Machakos|Kenya, Makueni|Tanzania, Mara|Kenya, Marsabit|Tanzania, Arusha|Tanzania, Dodoma|Kenya, Kajiado|Kenya, West Pokot|Kenya, Kisii|Kenya, Migori|Kenya, Murang'a|Kenya, Nandi|Kenya, Nyamira|Kenya, Nyandarua|Kenya, Nyeri|Kenya, Siaya|Kenya, Taita Taveta|Kenya, Tharaka-Nithi|Kenya, Trans Nzoia|Kenya, Turkana|Kenya, Uasin Gishu|Kenya, Vihiga|Kenya, Wajir|Kenya, Homa Bay|Kenya, Kakamega|Kenya, Kirinyaga|Kenya, Mandera
     1156333 | The Coastal Waters of Southeast Alaska and British Columbia|United States|North America|Mitkoff|Kupreanof|United States, Alaska|Petersburg Quad|United States, Alaska, City and Borough of Wrangell
    10095356 | The Coastal Waters of Southeast Alaska and British Columbia|United States, Washington, Clallam County|United States, Washington|United States|North Pacific Ocean
     1150164 | United States, California, Humboldt County|United States, California|United States|North Pacific Ocean
     1161545 | United States, California, San Diego County|Mexico, Baja California|United States, California|Coronado|Mexico|United States|North America|North Pacific Ocean
     1114126 | United States, California, San Diego County|United States, California, Orange County|Mexico, Baja California|United States, California|United States, California, Imperial County|Coronado|Mexico|United States|United States, California, Riverside County|North America|North Pacific Ocean
     1131674 | United States, Colorado|United States, Colorado, Grand County|United States|North America
    10091758 | United States, Oregon, Multnomah County|United States, Oregon|Government|United States, Oregon, Clackamas County|United States, Oregon, Washington County|United States, Washington|United States|United States, Washington, Clark County|North America|North Pacific Ocean
    10095240 | United States, Washington, Mason County|United States, Washington, Pierce County|United States, Washington, Kittitas County|United States, Washington, Lewis County|United States, Washington, Kitsap County|Fox|Anderson|Blake|Bainbridge|United States, Washington, Grays Harbor County|Canada, British Columbia|Portage|The Coastal Waters of Southeast Alaska and British Columbia|United States, Washington, Skagit County|United States, Washington, Chelan County|United States, Washington, Clallam County|United States, Washington, Yakima County|Sucia|United States, Washington, Snohomish County|United States, Washington|United States, Washington, King County|Blakely|Pender|Portland|San Juan|Cypress|Tumbo|Mayne|Protection|Patos|Sidney|United States|Guemes|Henry|Saturna|United States, Washington, Thurston County|Burrows|Lumni|Marrowstone|Orcas|North America|Vancouver|Vashon|Waldron|United States, Washington, Island County|Piers|Samuel Island|Shaw|United States, Washington, Whatcom County|Sinclair|McNeill|Speiden|Decatur|Canada|United States, Washington, Jefferson County|United States, Washington, San Juan County|Allan|Lopez|Oak Harbor|Salt Spring|Coal|Discovery|Hat|Harstine|James
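
To make the output above easier to scan, each row can be split into its pipe-delimited geography terms. This is a minimal sketch using three rows copied from the table; the "watery" filter is a naive substring check added purely for illustration and is not how Arctos classifies geography.

```python
# Three rows copied from the output above (psql-style "id | value" lines).
ROWS = """\
    10017175 | Mexico|North America|Pacific Ocean, Gulf of California|Mexico, Baja California
     1151427 | NONE
     1150164 | United States, California, Humboldt County|United States, California|United States|North Pacific Ocean
"""


def parse_row(line):
    """Split one output line into (locality_id, list of geography terms)."""
    locality_id, _, geogs = line.partition(" | ")
    geogs = geogs.strip()
    terms = [] if geogs == "NONE" else geogs.split("|")
    return int(locality_id), terms


for line in ROWS.splitlines():
    locality_id, terms = parse_row(line)
    # Naive illustration only: flag terms that mention an ocean or sea.
    watery = [t for t in terms if "Ocean" in t or "Sea" in t]
    print(locality_id, len(terms), watery)
```

Counting terms per locality this way shows quickly which coordinates intersect nothing (NONE) and which intersect many overlapping shapes.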


dustymc avatar Feb 05 '24 22:02 dustymc

@dustymc

would need a definition added.

I'm not sure what a definition for each entity is. For example, the first entry Adriatic Seas has the following data in higher geography:

Mediterranean Sea, Adriatic Sea | Mediterranean Sea | Adriatic Sea | https://en.wikipedia.org/wiki/Adriatic_Sea | spatial data source iho_world_seas::Adriatic Sea | TRUE | 10007368

Do you need something more than this? I was hoping that the table used to limit/structure the locality attribute "water body" could be the same as the IHO seas in Continents/Oceans or a subsection of it, but will work on definitions if something else is needed.

As Genna said, we are not expecting shape files to be associated with these entries, as that may cause a conflict with the shape file of the administrative unit (GADM) also assigned to the catalog record. We do want the data in this locality attribute to be mapped to the Darwin Core waterBody term (dwc:waterBody) for aggregators.

@falco-rk and Hannah at OGL may have different needs, but this should be some help in getting their data into Arctos.

here's the few records I processed,

Not having a before and after, I'm not sure what changes you made. Should I try to figure that out or do you not plan to pursue this approach any further?

Thanks again for working with us to find a way to capture this data more effectively.

sharpphyl avatar Feb 06 '24 00:02 sharpphyl

Will implementation mean removal of the IHO Seas from higher geography?

Jegelewicz avatar Feb 06 '24 15:02 Jegelewicz

removal of the IHO Seas from higher geography?

Not at all.

The "functional" alternative right now would be to assert two localities

  • California (from GADM)
  • Pacific Ocean (from IHO)

The "maybe needs https://github.com/ArctosDB/arctos/issues/5597" alternative would be to find a spatial data source that does whatever's desired, or to create a "merge layer" using bits of whatever sources might do whatever's desired, make that an authority, and then just using it for beachy-things in CA would include California and Pacific Ocean. I'm not sure if this pursuit has been exhausted, but it probably should be before we go somewhere else.

This just lets someone say "Pacific Ocean" alongside a "normal" locality assertion - and doesn't have enough data to complain when the coordinates and assertion are in Florida (and that should probably be made clear from any eventual definitions, there's clearly some confusion in how this stuff works).

I think the goal involves https://github.com/ArctosDB/arctos/issues/7348 (but ???, and ???? re: how that fits with existing mappings).

Maybe related to the above, maybe a shortcoming in our model: it was mentioned a few times that we need waterbody, but I never managed to understand a clear example of that.

first entry Adriatic Seas

I was getting a bunch of them - these terms are unique here but components of bigger/smaller things in the source. I can try to look at it again when I can find time, get you a geog dump, ???? - and maybe @Jegelewicz can help figure out what's useful here. (I'm not sure these need defined, they're not functional in any way that I can see, but that idea hurts my brain...)

same as the IHO seas in Continents/Oceans

Same terms, but a different structure is necessary. (And the data I dumped isn't just sea, maybe I got lost, let me know if you need something else.)

not expecting shape files

To be clear: That's just not possible here, it's a different kind of data in a different structure.

before and after,

I didn't change anything, I just ran your coordinates against existing spatial data (in an effort to help us all understand what a spatial approach to this means). Eg https://arctos.database.museum/editLocality.cfm?locality_id=10095240 you say "United States, Washington" but the spatial data (your coordinates) claim it could be in a couple dozen other things.

I was actually listening to some comments in the discussion and trying to read between the lines of @ekrimmel 's survey, wondering if y'all wouldn't be happier if all of "higher geography" was something with which you could do WHATEVER, and the spatial stuff something completely separate, not asserted, just coordinate-derived. I got the idea that not allowing "Amundsen Sea, Kansas" was seen as some sort of fault or drawback of Arctos, and that for various reasons (permits? tradition? previous political situations?? IDK...) you might want to assert a near-infinite number of things that look strange from a spatial perspective. New Issue if any of that's anywhere near the mark...

dustymc avatar Feb 06 '24 16:02 dustymc

wondering if y'all wouldn't be happier if all of "higher geography" was something with which you could do WHATEVER, and the spatial stuff something completely separate

I think this is sorta true, but a little more nuanced. IMO we are squishing together two things that maybe shouldn't be squished together.

Textual descriptions of locality (even when they MIGHT have associated footprints) are generally subject to the whims of humans and nature. Today California is one shape, in 100 years, some of it has succumbed to the Pacific Ocean, and it is another. 34.0549, -118.2426 along with a datum is the same spot on the planet, no matter what - right? Making attempts to match those with the textual description and ignoring the time component just makes messes.

Perhaps crazy idea - the only REAL localities are 34.0549, -118.2426 along with a datum, everything else is an assertion that requires a time component to interpret properly (the datum is actually the time component for the coordinates), without it, even that point on the planet is unknowable. I think the idea that locality can be divested of the time component is what causes most of our troubles. I also think that we want to be able to assert things, even if they end up looking crazy and there shouldn't be any reason we cannot do both, it just needs to be clear what is what.

Sorry for the off-topic rant. If someone wants to say 34.0549, -118.2426 has Waterbody of Pacific Ocean, that is their prerogative. I might rely on the words, but more likely, I would rely on the coordinates and either way I might be misled as both were entered by fallible humans. I do think people are looking for ways to make sure they don't make up "geography" so a code table format to ensure that I don't make a nonexistent combination of state and county is useful. But in a way - isn't that just an attribute of 34.0549, -118.2426 as much as Waterbody?

Jegelewicz avatar Feb 06 '24 16:02 Jegelewicz

someone wants to say 34.0549, -118.2426 has Waterbody of Pacific Ocean, that is their prerogative

That is (I think) what this Issue is about.

I would rely on the coordinates

I definitely don't give assertions much clout, and would encourage anyone running analyses to do the same! And hopefully the other stuff doesn't just look like 'Arctos is broken' to the world....

time component

That's probably OK as a string (assert whatever you want), impossible under a controlled vocabulary ("France" has covered most of the planet over time - at least according to the French, a point which must clearly also be included), but is just trivial under a spatial perspective (toss it at whatever GIS layer you want). So now I'm wondering if trying to control this at all (or add complexity in any way) isn't a mistake....

dustymc avatar Feb 06 '24 16:02 dustymc

someone wants to say 34.0549, -118.2426 has Waterbody of Pacific Ocean, that is their prerogative

That is (I think) what this Issue is about.

Yes!

I would rely on the coordinates

Of course, but sometimes the geographic data of a specimen is just "Pacific Ocean", and geolocating that doesn't do a whole lot (and could break Arctos, maybe?). Being able to assert waterbody=Pacific Ocean (and, maybe one day, have that waterbody carry a shape as well) for records both with and without coordinates and/or detailed textual geographic data is very useful curatorially.

genevieve-anderegg avatar Feb 06 '24 16:02 genevieve-anderegg

geographic data of a specimen is just "Pacific Ocean", and geolocating that doesn't do a whole lot

Hard disagree from me. Maybe adding explicit point-radius error would make messes (and loop in a lot of land), but just the bare coordinates (which can ONLY defensibly be interpreted as 'unknown error' and NEVER as 'infinitely precise') would do it, or just use the shape.
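
To illustrate the point-radius concern numerically, here is a small sketch using the haversine great-circle formula. The center point and error radius are hypothetical values chosen for illustration, and the Honolulu coordinates are approximate; none of these come from Arctos data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical georeference for a record whose only locality is "Pacific Ocean".
center = (0.0, -160.0)
error_radius_km = 5000.0  # hypothetical radius, still far too small to cover the whole ocean

# Approximate coordinates for Honolulu, used as a sample point on land.
honolulu = (21.31, -157.86)

distance = haversine_km(*center, *honolulu)
print(f"{distance:.0f} km from center; inside error circle: {distance <= error_radius_km}")
```

Any land point that falls inside the circle becomes a possible collecting location under a point-radius interpretation, which is the "loop in a lot of land" problem. A bare point with unknown error, or the water body's own shape, avoids that.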

dustymc avatar Feb 06 '24 16:02 dustymc

Maybe adding explicit point-radius error would make messes (and loop in a lot of land)

Yeah, if you drop a point in the middle with an error radius, it includes a ton of land (islands, all kinds of shoreline), which really isn't ideal. Didn't you say that would be really taxing on Arctos? (could be misremembering from yesterday)

or just use the shape.

Which would be ideal.

I think this issue is a good stepping stone to getting there.

genevieve-anderegg avatar Feb 06 '24 17:02 genevieve-anderegg

really taxing on Arctos?

And people - but you don't need the radius. Just use the point.

good stepping stone to getting there

That's been around for a few years? (The option to "just use geog shape" may be disabled at the moment, but that's a tech issue - complain, loudly, to @mkoo !)

dustymc avatar Feb 06 '24 17:02 dustymc

The only way I would get on board with this proposal is if the "watery" components of higher geography are removed and there is only one place to assert bodies of water. Allowing the same term in two places will only confuse and de-normalize.

Once again, I feel the lack of a definition for higher geography is coming to bite us. Because things in higher geography can overlap, there are places where selecting it is an impossible choice. If higher geography in Arctos were limited to political boundaries, then it would be clear that Pacific Ocean belongs elsewhere. If all we do is add an attribute - at least half of those doing data entry will never use it because they already selected the same term as part of higher geography. No matter how hard anyone tries to avoid this it will happen.

The things from IHO Seas that are currently in higher geography could be moved to geographic features.

Jegelewicz avatar Feb 06 '24 20:02 Jegelewicz

Allowing the same term in two places

Well maybe, but the data are about as different as they can be, so I don't think anyone who's doing much analysis or searching with any sort of precision is going to have much trouble telling strings from shapes.

higher geography in Arctos were limited to political boundaries

That's not the conclusion of the geo group - Arctos is limited only to "authorities" and can't/won't/doesn't care about the contents of those authorities. The only real question is what constitutes an authority, but I'll support most anything that's 'published' somewhere.

things in higher geography can overlap, there are places where selecting it is an impossible choice.

OK, I'd probably oppose any "authority" that didn't have some sort of structure, but so far that's not been an issue and I don't think this is actually a consideration.

If all we do is add an attribute - at least half of those doing data entry will never use it because they already selected the same term as part of higher geography.

Yes, any sort of independent assertion is going to be arbitrarily used. (That's precisely what I was trying to avoid with the 'georeference everything and let Arctos figure this out' approach that was so soundly rejected a couple years back.)

The things from IHO Seas that are currently in higher geography could be moved to geographic features.

No, they cannot. They're not the same kind of data, locality attributes cannot support what's in geography. We could add the terms - not the data - to https://arctos-test.tacc.utexas.edu/info/ctDocumentation.cfm?table=ctlocality_attribute_type#feature / https://arctos-test.tacc.utexas.edu/info/ctDocumentation.cfm?table=ctfeature instead of something new if that serves the purpose.

dustymc avatar Feb 06 '24 21:02 dustymc

No, they cannot.

Again we are running into assertable features and "fun flexible features". We need to call these something different.

Jegelewicz avatar Feb 06 '24 21:02 Jegelewicz

What is the difference between adding "water body" with IHO/Marine Region listed entries in the dropdown table vs. our current "drainage" which has a similar (and somewhat overlapping) dropdown table? What requirements did the items in the dropdown meet for "drainage" except that a collection needed that entity to accurately describe the locality? If the Arctic Ocean is a drainage, isn't it also a water body?

Screenshot 2024-02-07 at 7 20 12 AM

sharpphyl avatar Feb 07 '24 14:02 sharpphyl

Arctic Ocean drainage==everything that drains into the ocean (and none of the ocean??)

Arctic Ocean waterbody==all of the ocean, none of anything else

What requirements did the items in the dropdown meet for "drainage"

We don't have spatial layers (so I - along with everybody else - am left guessing).

dustymc avatar Feb 07 '24 15:02 dustymc

When we originally requested drainage be added for our fish collection, the values we needed to migrate included both text terms (Rio Grande Drainage) and HUC terms for terrestrial drainages. https://water.usgs.gov/GIS/huc.html

campmlc avatar Feb 07 '24 15:02 campmlc

Ok, the overlap makes sense. It looks like the drainage table is not assertable either but has an accepted source as Mariel identified. Is using the IHO table for the water body dropdown a similar process?

sharpphyl avatar Feb 07 '24 19:02 sharpphyl

HUC

https://github.com/ArctosDB/arctos/issues/5223 could use some resolvin'

IHO table for the water body dropdown a similar process

Yes and no, I think. @Jegelewicz excellent point about normalization applies. Unlike watershed, for which we could find no relevant spatial data, we clearly have these spatial data, and there's a special dirty word just for scattering like things around in different-but-not-really baskets....

I no longer have any idea if we're solving an administrative problem or making a huge mess of the data here; it's seemed like both from time to time.

dustymc avatar Feb 07 '24 20:02 dustymc

Having a waterbody locality attribute that is just the terms, and not the associated shapes, of waterbodies will help our collections begin to record our data in the way we want to. If one day all our wishlist items are met (#5597, #6521), then these attributes could be used to apply the shapes/geography associated with them to our catalog records (and all the intersects and edge cases can be resolved with an updated model/increased spatial capabilities). I think we should go forward with this, because it will help in the meantime while we wait for Arctos to have better spatial capabilities.

solving an administrative problem or making a huge mess of the data here

It helps our administrative problems and I think it sets us up for future successes with geography data once improvements to Arctos are made.

genevieve-anderegg avatar Feb 07 '24 22:02 genevieve-anderegg

Need source-based definition: "appears in list at url" is ideal. (Anything in https://www.marineregions.org/ ? Some subset of https://www.marineregions.org/ ?)

dustymc avatar Mar 04 '24 20:03 dustymc

Definition

  1. called "waterBody"
  2. use https://www.marineregions.org/ as source for code table requests
  3. must be spatially explicit (even though the shape will not be linked with these terms)

genevieve-anderegg avatar Mar 04 '24 20:03 genevieve-anderegg

What's 'must be spatially explicit' mean? (Can you maybe find an example that's OK/should be approved for the code table and one that's not?)

Also - should this be limited to water? If so, how? Eg https://www.marineregions.org/gazetteer.php?p=details&id=32570 exists and as far as I can tell refers to the port, not the water. (And it's a point - is that 'spatial' and, either way, is there some way to tell other than zooming?)

I still don't care what the answers to any of those questions are, I'd just like to have some formula such that when someone requests eg Agios Efstratios (or whatever) be added, we can all understand what the answer will be. (That one does seem to be 'spatial,' but a void in a waterbody rather than a waterbody...)

https://www.marineregions.org/gazetteer.php?p=details&id=19219 is a waterbody but not (exactly) spatial.

https://www.marineregions.org/gazetteer.php?p=details&id=8501 is someone's weird division?? I think from a publication, https://www.researchgate.net/figure/Brattegard-Holthe-1997-divided-the-Norwegian-coast-in-26-different-sectors-to-map-the_fig1_257733145

https://www.marineregions.org/gazetteer.php?p=details&id=64250 is in water...

(I searched 'rat' and clicked the interesting stuff...)

I think we need to narrow this down, somehow?

dustymc avatar Mar 05 '24 13:03 dustymc

What's 'must be spatially explicit' mean?

The name that someone wishes to add to this table must correspond to a shape in Marine Regions (even though the shape will not itself be added)

I think we need to narrow this down, somehow?

Yes, agreed. Could we narrow it down by the Marine Regions "Placetype" category?

They have many options (n=327): Marine Regions Placetype Terms.xlsx

Maybe OGL and us at DMNS can go through and identify Placetype term categories that have place shapes we want for our list of terms? That should cut out a lot of the weird things in Marine Regions. Then we can say that if we want to add a name from another Placetype category it has to go to the committee for discussion?

Also, when identifying the names we want, maybe we should work backwards from the DWC term for waterbody so that all of the terms we want added on this proposed code table match the DWC waterbody definition, so that once we can link shapes to records in the way we want, everything is easily convertible.
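
A rough sketch of how the Placetype idea could work as a routing rule, in case it helps the discussion. Everything here is an assumption for illustration: the allowlist contents, the `has_shape` flag, and the entry dictionaries are made up by hand and are not pulled from Marine Regions or from any Arctos table.

```python
# Hypothetical allowlist of Placetype categories; the real list would come
# out of the proposed review of the Marine Regions Placetype terms.
APPROVED_PLACETYPES = {"IHO Sea Area", "Ocean", "Sea"}


def route_request(entry, approved=APPROVED_PLACETYPES):
    """Decide what happens to a requested name.

    `entry` is a hand-built dict with hypothetical keys:
      - "name": the requested term
      - "placetype": the Placetype category of the gazetteer record
      - "has_shape": True if the record corresponds to a shape
    """
    # The name must correspond to a shape, even though the shape is not stored.
    if not entry.get("has_shape", False):
        return "reject: no shape"
    # Names in an approved Placetype category need no discussion.
    if entry["placetype"] in approved:
        return "add to code table"
    # Anything else is the case that would need discussion.
    return "send to committee"


# Made-up entries, only to show the three outcomes.
requests = [
    {"name": "Example Sea", "placetype": "Sea", "has_shape": True},
    {"name": "Example Harbour", "placetype": "Harbour", "has_shape": True},
    {"name": "Example Point", "placetype": "Sea", "has_shape": False},
]
for r in requests:
    print(r["name"], "->", route_request(r))
```

The point of writing it this way is that the only judgment call left is the content of the allowlist; each individual request is then a lookup rather than a debate.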

genevieve-anderegg avatar Mar 20 '24 17:03 genevieve-anderegg

go to the committee for discussion

I don't like that because it's arbitrary. If at all possible, it would be useful to have some sort of go/nogo test for these where we're just following some bigger-than-Arctos "standard" rather than discussing every term.

match the DWC waterbody definition

Then let's do that instead of this??

appropriate vocabularies include: HydroLAKES (https://www.hydrosheds.org/page/hydrolakes), a database aiming to provide the shoreline polygons of all global lakes with a surface area of at least 10 ha, the 'water body' term in the OBO ontology (http://purl.obolibrary.org/obo/ENVO_00000063) and, for marine water bodies, IHO Sea Areas (http://www.vliz.be/en/imis?dasid=5444&doiid=323)

dustymc avatar Mar 20 '24 17:03 dustymc

I don't like that because it's arbitrary. If at all possible, it would be useful to have some sort of go/nogo test for these where we're just following some bigger-than-Arctos "standard" rather than discussing every term

Totally agree

Then let's do that instead of this??

I think doing something that follows the definition would steer us in the right direction. The definition does say: "Suggestions for appropriate vocabularies include: HydroLAKES (https://www.hydrosheds.org/page/hydrolakes), a database aiming to provide the shoreline polygons of all global lakes with a surface area of at least 10 ha, the 'water body' term in the OBO ontology (http://purl.obolibrary.org/obo/ENVO_00000063) and, for marine water bodies, IHO Sea Areas (http://www.vliz.be/en/imis?dasid=5444&doiid=323)."

Emphasis mine. So the DwC definition does not lay out a hard and fast rule, but rather suggestions for following datasets (which was our plan, yay!). So, what if we follow the lead of the definition and choose a few specific datasets to include? Suggestions that would be useful for us at DMNS (and not full of clutter): HydroLakes and IHO Sea Areas (from above), and EEZs. Maybe there are some specific lists (PlaceTypes) from Marine Regions that would be useful too?

What do we think? Would we rather make this a very broad field? @Jegelewicz @mkoo @happiah-madson @ekrimmel @falco-rk

genevieve-anderegg avatar Mar 20 '24 19:03 genevieve-anderegg

choose a few specific datasets to include?

I think this sounds like a good idea. We don't have a single authority for geography, why should we expect to find a single one here? Let's just make our list, then any terms that can be found in those authorities can be added to this code table without the need for all the box checking (but we should still make Github issues so that we can keep track of things).

Jegelewicz avatar Mar 20 '24 22:03 Jegelewicz

HydroLAKES (https://www.hydrosheds.org/page/hydrolakes), a database aiming to provide the shoreline polygons of all global lakes with a surface area of at least 10 ha

marine water bodies, IHO Sea Areas (http://www.vliz.be/en/imis?dasid=5444&doiid=323).

These seem like good places to start. HydroLAKES is a data download so maybe we just add the whole thing?

the 'water body' term in the OBO ontology (http://purl.obolibrary.org/obo/ENVO_00000063)

This isn't a list of waterbodies....

Jegelewicz avatar Apr 01 '24 19:04 Jegelewicz

From last geography meeting: Refining the waterbody code table definition (referencing DarwinCore definition):

  1. table is named "waterBody"
  2. requested values must refer to vocabularies in the following datasets that are associated with a waterbody shape file (while the shape file itself will not be included in the code table)
  3. values can be added from the following datasets: HydroLAKES (https://www.hydrosheds.org/page/hydrolakes), which contains shoreline polygons for lakes larger than 10 hectares; IHO Sea Areas (http://www.vliz.be/en/imis?dasid=5444&doiid=323) for marine water bodies.
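
A minimal sketch of the go/no-go test this definition implies, assuming the name lists from each dataset have been extracted into plain sets of strings. The two sets below are tiny hand-typed placeholders, not actual extracts of HydroLAKES or IHO Sea Areas, and the function and class names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

# Placeholder name lists (assumption): the real ones would be built from the
# HydroLAKES and IHO Sea Areas downloads.
APPROVED_SOURCES: Dict[str, Set[str]] = {
    "IHO Sea Areas": {"Arctic Ocean", "North Atlantic Ocean"},
    "HydroLAKES": {"Example Lake"},
}


@dataclass(frozen=True)
class Decision:
    value: str
    accepted: bool
    source: Optional[str]  # which dataset the name was found in, if any


def _normalize(name: str) -> str:
    # Collapse whitespace and ignore case so trivial variants still match.
    return " ".join(name.split()).casefold()


def check_waterbody_request(value: str, sources=APPROVED_SOURCES) -> Decision:
    """Accept a requested value only if it appears in an approved dataset."""
    wanted = _normalize(value)
    for source_name, names in sources.items():
        if wanted in {_normalize(n) for n in names}:
            return Decision(value, True, source_name)
    return Decision(value, False, None)


print(check_waterbody_request("Arctic Ocean"))
print(check_waterbody_request("Agios Efstratios"))
```

With something like this, a code table request either names its source dataset or it does not qualify, and no per-term discussion is needed.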

Edits/thoughts welcome, barring calls for larger discussion (#7666)...

genevieve-anderegg avatar Apr 15 '24 22:04 genevieve-anderegg

table is named "waterBody"

hu?

IHO Sea Areas

I'm still not crazy about that, those are already in Arctos as assertable geography, https://arctos.database.museum/place.cfm?sch=geog&valid_catalog_term_fg=1&geog_remark=iho. Having two ways of saying the same thing should require some (very!) special justification. (Or someone can perhaps explain to me how it's not the same and I'm just lost??)

dustymc avatar Apr 15 '24 22:04 dustymc

hu?

"waterbody" would be the name of the locality attribute, sorry for the confusion!

two ways of saying the same thing

Yeah not ideal, but if you need to say your shell was collected in the water off of Maine (for permitting/legal purposes) as well as the North Atlantic, then you could use Maine for your asserted higher geography and put North Atlantic in an attribute (per this request). Might need to tighten up the definition so we don't end up creating justification for having all the geo data in two places. But there really isn't a great way to store both the administrative and non-administrative watery geo data right now for all the needs these data need to meet (#7660). Maybe this goes on the work group/issues agenda...?
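
To make the Maine / North Atlantic case concrete, here is a small sketch of the idea, not of how Arctos actually stores anything. The `Locality` class, its field names, and the higher geography strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Locality:
    # Administrative (asserted) higher geography, used for permitting/legal purposes.
    higher_geog: str
    # Coordinates are optional: sparse records are often not georeferenced.
    coordinates: Optional[Tuple[float, float]] = None
    # Locality attributes, e.g. {"waterbody": "North Atlantic Ocean"}.
    attributes: Dict[str, str] = field(default_factory=dict)


def find_by_waterbody(localities: List[Locality], name: str) -> List[Locality]:
    """Return every locality tagged with the waterbody, georeferenced or not."""
    return [loc for loc in localities if loc.attributes.get("waterbody") == name]


localities = [
    # Georeferenced record: Maine for permitting, North Atlantic as the attribute.
    Locality("United States, Maine", (43.9, -69.0),
             {"waterbody": "North Atlantic Ocean"}),
    # Sparse record with no coordinates still gets a controlled waterbody name.
    Locality("no higher geography recorded", None,
             {"waterbody": "North Atlantic Ocean"}),
    Locality("United States, Colorado", (39.7, -105.0)),
]

hits = find_by_waterbody(localities, "North Atlantic Ocean")
print(len(hits))  # both North Atlantic records, with and without coordinates
```

The administrative name and the waterbody name live in different slots, so neither has to be overloaded to carry the other.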

genevieve-anderegg avatar Apr 15 '24 23:04 genevieve-anderegg