taxonworks
Model(?) - As a user I want a low-res presence/absence map for OTUs
Addresses the need for TaxonPages API call.
- The map uses a base-set tile, likely Natural earth, to level one subdivisions as shapes
- The map areas have 2 states, on, off
- Rasters are an option to consider
- Calculating on the postgis back end is an option to consider, will have to benchmark.
Considerations:
- Use raster?
- Store raster, compute on demand?
- Store raster, cache in background?
- For large underlying result sets we can limit to n queries of limit 1, where n is the number of areas that could be filled.
- For small underlying result sets we can simply enumerate all intersections?
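The last two considerations can be sketched in plain Ruby, using axis-aligned bounding boxes as a stand-in for the PostGIS intersection test. The `Box` struct, both method names, and the sample data are hypothetical. In a real query each check would presumably be an `ST_Intersects` call, and the first strategy corresponds to one `LIMIT 1` query per candidate area.

```ruby
# Hypothetical stand-in for a geometry: an axis-aligned bounding box.
Box = Struct.new(:name, :min_x, :min_y, :max_x, :max_y) do
  def intersects?(other)
    min_x <= other.max_x && other.min_x <= max_x &&
      min_y <= other.max_y && other.min_y <= max_y
  end
end

# Large result sets: one short-circuiting check per target area
# (the analogue of n queries with LIMIT 1, n = number of areas).
def filled_areas_by_area(areas, items)
  areas.select { |area| items.any? { |item| area.intersects?(item) } }
       .map(&:name)
end

# Small result sets: enumerate every intersection, then reduce to names.
def filled_areas_by_enumeration(areas, items)
  items.flat_map { |item| areas.select { |area| area.intersects?(item) } }
       .map(&:name)
       .uniq
end

# Made-up data: three target areas and three data items.
TARGET_AREAS = [
  Box.new('A', 0, 0, 10, 10),
  Box.new('B', 10.5, 0, 20, 10),
  Box.new('C', 30, 30, 40, 40)
].freeze

DATA_ITEMS = [
  Box.new('p1', 1, 1, 1, 1),
  Box.new('p2', 12, 5, 12, 5),
  Box.new('p3', 2, 2, 3, 3)
].freeze

puts filled_areas_by_area(TARGET_AREAS, DATA_ITEMS).inspect
puts filled_areas_by_enumeration(TARGET_AREAS, DATA_ITEMS).inspect
```

Both methods yield the same set of names; they differ only in how much work is done per area, which is what would need benchmarking.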
Let the backend only return the natural earth shape NAMES and have the frontend pick the polygons up from an assets server? Could probably be vector data like a modern equivalent of jvectormap so we don't need a tile server for multiple zoom levels.
Let the backend only return the natural earth shape NAMES
I think this is definitely worth pursuing. We need a string summary anyways, for human readable consumption at some level, so this approach would have twice the value.
https://observablehq.com/@d3/world-choropleth https://github.com/topojson/world-atlas https://d3-graph-gallery.com/graph/choropleth_basic.html
I'm gathering that it might be hard to do NAME with confidence down to the state/province level, though.
https://www.mapchart.net/world.html. See option to turn-on boundaries.
I wonder if hex-bin approach might be better for us.
A lot of these shapes are VERY detailed and the WKT would be very voluminous. A list of names or codes would definitely be human friendly. WKB is denser, but maybe practical for client side compositing.
I would think both names/codes and WKB together would be better than sending the former and then the client requesting the latter. The JSON can pair these nicely and the request could optionally specify which data is needed.
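A minimal sketch of such a paired payload, assuming a hypothetical `presence_payload` helper and made-up field names (`name`, `code`, `wkb_hex`) and `only:` option; none of these are existing TaxonWorks API fields. The WKB value is just the hex encoding of `POINT(1 2)`, used as a placeholder geometry.

```ruby
require 'json'

# Hypothetical payload builder: pairs each area's name/code with an optional
# hex-encoded WKB string. `only:` selects which parts the caller asked for.
def presence_payload(areas, only: %i[names shapes])
  rows = areas.map do |area|
    row = {}
    if only.include?(:names)
      row[:name] = area[:name]
      row[:code] = area[:code]
    end
    row[:wkb_hex] = area[:wkb_hex] if only.include?(:shapes)
    row
  end
  JSON.generate(areas: rows)
end

# Placeholder data; the WKB is POINT(1 2), not a real state outline.
SAMPLE_AREAS = [
  { name: 'Illinois', code: 'US-IL',
    wkb_hex: '0101000000000000000000F03F0000000000000040' }
].freeze

puts presence_payload(SAMPLE_AREAS, only: [:names])
puts presence_payload(SAMPLE_AREAS)
```

A client that only needs the human-readable summary would ask for names alone and skip the heavier shape data.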
A lot of these shapes are VERY detailed.
The NE shapes are not, they are optimized for exactly this kind of thing. This is very much a simplify and send exercise. Nobody is proposing to send original data.
There are algorithmic considerations for building the list to send.
If we have n data points (geospatial shapes) and m possible map target shapes, then when n >> m, or m >> n, we can optimize obvious intersection checks etc.
My gut feeling is that we should, at minimum, start by identifying the queries, per OTU, that result in native SQL lists of GeographicItem shapes. From that we can explore ST_Simplify, merging, etc. in native SQL. For large n, only when we have a reduced geometry should we then intersect it with m. For small n we might as well just loop and calculate the intersections to get the list.
Since we already persist the NE shapes we need on our backend we should be able to build some very efficient queries to calculate the shape(s) to send back. It could just be one merged polygon run through with simplify.
What is the hierarchical scope above genus? I'm apparently missing an important implicit attribute of this function. If the limit is, for example, order, then the map would probably be the whole world.
There is no metadata in the map, it is on/off. All data are gathered for the current OTU and below.
In practice this is simply a join like otus: {id: Otu.coordinate_otus(@otu)}
IIRC.
Aside from shape files, geographic items also theoretically can be various other types, from point to geometry collections. In my local database only points and multi polygons exist, although there is discrepancy in the counts. I am assuming other types should be depicted.
We need to render any of the types that we support.
Snapshot observations on functional modularity, post experimentations: In the mid- to low-level function, we want to fetch (unique) geographic items for a list of OTUs, which could be a discrete functional level. Additional levels of specialization would be e.g., specific criteria for OTU selection, such as all descendants below one OTU, or amalgamation of unique geographic items into the minimal set of item types (i.e., combining sets of overlapping polygons into single perimeter polygons). The aforementioned first two functions are partially extant (limited) in the api call .../otus/{id}/inventory/distribution, which emits geoJSON.
In the issue #3010 title the implied undesirable characteristics of .../inventory/distribution are polygon repetition (and concomitant intensification). A third party renderer shows a longitudinal offset to the base map as well. This offset anomaly is not observed in the geojson.io site, so probably not intrinsically our data issue.
The above exhibits undesirable (?) intensification from sub-areas, and also has several duplicate georeferences (which produce no visual artifacts). Starting to experiment with simplify for overlapping multi polygons in a geometry collection.
Above used the API for genus Rhipipteryx, copying geoJSON from result to geojson.io
If the overarching intent is to render only non-overlapped geographic areas, it should be possible to eliminate some items from the query results by using the area hierarchy. If the containing area (ancestor_id present) of this area is present then omit it. This would filter on the array of geographic area/item IDs. This does not cover the case where areas are adjacent, but it's not clear that areas reduced in this way will present any performance problems. Also there may be some user benefit to rendering the boundaries, especially if we decide to articulate the areas with hover/tooltips.
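The hierarchy filter described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `parent_of` lookup (area id to parent area id, `nil` at the root) that forms a tree; the method name and the sample IDs are made up.

```ruby
require 'set'

# Drops every area that has an ancestor already present in the list.
# parent_of: hypothetical map of area id => parent area id (nil at the root).
def prune_contained(area_ids, parent_of)
  present = area_ids.to_set
  area_ids.uniq.reject do |id|
    ancestor = parent_of[id]
    covered = false
    while ancestor
      if present.include?(ancestor)
        covered = true
        break
      end
      ancestor = parent_of[ancestor]
    end
    covered
  end
end

# Made-up hierarchy: 1 = country, 2 and 4 = states, 3 and 5 = counties.
PARENT_OF = { 1 => nil, 2 => 1, 3 => 2, 4 => 1, 5 => 4 }.freeze

puts prune_contained([3, 2, 5], PARENT_OF).inspect
```

Here county 3 is dropped because its state 2 is present, while county 5 stays because neither its state nor its country is in the list. Adjacent areas are untouched, as noted above.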
From your comments today, it sounds like you want to add an additional constraint to not depict points per se, but to instead convert them to their "most common denominator" Natural Earth area?
And do not depict county-level geographic areas. Bring everything up to the state level. I would suggest shading both state and country areas.
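Bringing areas up to the state level could look roughly like this. The level encoding (0 = country, 1 = state/province, 2 = county), the `AREA_TABLE` hash, and the method name are assumptions for illustration, not the actual geographic area schema.

```ruby
# Hypothetical area table keyed by id.
# level: 0 = country, 1 = state/province, 2 = county (assumed encoding).
AREA_TABLE = {
  1 => { id: 1, level: 0, parent_id: nil },
  2 => { id: 2, level: 1, parent_id: 1 },
  3 => { id: 3, level: 2, parent_id: 2 }
}.freeze

# Walks up the hierarchy until the area is at state level or above.
def roll_up_to_state(area_id, areas)
  area = areas.fetch(area_id)
  while area[:level] > 1
    area = areas.fetch(area[:parent_id])
  end
  area[:id]
end

# County 3 becomes state 2; state 2 and country 1 are left as they are.
puts [3, 2, 1].map { |id| roll_up_to_state(id, AREA_TABLE) }.uniq.inspect
```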
I want to use ST_Union on the equivalent result of /api/v1/otus/{otu_id}/inventory/distribution (mocked currently as .../inventory/presence in branch 3010_presence_map ) AS a postGIS geometryCollection. Since the result in /distribution is constructed as geoJSON, should I translate the geographic object constructions to the postGIS syntax ... OR should I use the ST_GeomFromGeoJSON function on the resultant geoJSON ? While the execution time is questionable, this should produce the minimal geometry. Unfortunately, ST_GeomFromGeoJSON seems to only accept simple geometry types.
Sounds like you're at the point where you want to try more data?
I think using the resultant geoJSON would be inefficient as we'd have to go to and from the database. I think we want to get the native data "behind" the geoJSON and work with that before it's geoJSON?
native -> compile -> geoJson -> native -> fn()
<- native 2x is a big red flag
SELECT ST_AsGeoJSON(ST_Union(ST_GeomFromGeoJSON('{"type":"Point","coordinates":[-48.23456,20.12345]}'), ST_GeomFromGeoJSON('{"type": "Point","coordinates":[-80.2499,-1.4165,0]}'))) As wkt; succeeds with "{""type"":""MultiPoint"",""coordinates"":[[-48.23456,20.12345],[-80.2499,-1.4165]]}"
I originally swung for the fences by pasting the API output from .../api/v1/otus/{id}/inventory/distribution
Of course this is exactly the case you red flagged above, but I'm (re?)gaining insight into the nuances among postGIS WKT, geoJSON, and our geo_X tables.
A migration will be needed to handle the following requirements:
CREATE EXTENSION postgis_raster;
SET postgis.gdal_enabled_drivers = 'ENABLE_ALL';
Note that the latter works, but might be overkill.
Following the above, this enables the drivers server-wide:
ALTER SYSTEM SET postgis.gdal_enabled_drivers TO 'ENABLE_ALL';
SELECT pg_reload_conf();
Does require further exploration.
Overview/review of presence map approach
Consolidate asserted distributions, collection object and type material geographic information for an OTU and its coordinate or subordinate OTUs into a monochrome shaded map. Existing maps are potentially comprised of a huge number of shapes and may have undesirable shading density artifacts.
Use a method similar to the otus_helper/otu_distribution to determine areas to be displayed.
A method for associating a geographic area/item with point data is necessary since these would otherwise tend to dominate maps. A method for finding the smallest (lowest level) area containing the point should be developed.
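One way to picture the "smallest containing area" lookup, as a sketch only: the `Area` struct, its `level` field (higher = lower in the hierarchy), and the bounding boxes are hypothetical stand-ins. A real implementation would presumably use a PostGIS containment test such as `ST_Contains` against the stored shapes.

```ruby
# Hypothetical area record; a bounding box stands in for the real polygon.
Area = Struct.new(:name, :level, :min_x, :min_y, :max_x, :max_y) do
  def contains?(x, y)
    x.between?(min_x, max_x) && y.between?(min_y, max_y)
  end
end

# Returns the deepest (highest level number) area containing the point,
# or nil when no area contains it.
def smallest_containing_area(areas, x, y)
  areas.select { |area| area.contains?(x, y) }.max_by(&:level)
end

# Made-up nested areas.
NESTED_AREAS = [
  Area.new('Country', 0, 0, 0, 100, 100),
  Area.new('State', 1, 10, 10, 50, 50),
  Area.new('County', 2, 20, 20, 30, 30)
].freeze

puts smallest_containing_area(NESTED_AREAS, 25, 25).name
puts smallest_containing_area(NESTED_AREAS, 40, 40).name
```

The first point falls inside all three boxes and resolves to the county; the second lies outside the county box and resolves to the state.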
Geographic areas should be selected distinct, and their intrinsic hierarchy should then be used to remove any subordinate areas. There is no reason to multiply process shapes, since we want a monochrome shaded area.
Since we iterate OTUs in separate loop sections, we might want to use a temporary table constructor or an array to collect area IDs, then select the geographic_items from the distinct geographic_area IDs.
Is there a convenient way to uniquify a Ruby array ?
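`Array#uniq` returns a new array with duplicates removed, keeping first-seen order. Alternatively, a `Set` avoids collecting duplicates in the first place when IDs arrive from several loop sections. A small illustration with made-up area IDs:

```ruby
require 'set'

# Made-up area IDs, as if gathered from separate loop sections.
COLLECTED_IDS = [12, 7, 12, 33, 7].freeze

# Option 1: collect everything, then de-duplicate once.
UNIQUE_IDS = COLLECTED_IDS.uniq

# Option 2: merge each batch into a Set as it arrives.
ID_SET = Set.new
[[12, 7], [12, 33], [7]].each { |batch| ID_SET.merge(batch) }

puts UNIQUE_IDS.inspect
puts ID_SET.to_a.inspect
```

If the IDs come from an ActiveRecord relation rather than an array, `distinct` performs the de-duplication in SQL instead.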
The minimal set of geographic items for these areas could then either be rendered as geoJSON polygons (similar to the currently collected areas), or collected using ST_Collect and then either sent as geoJSON or rasterized.
Rasterizing seems to require a compatible client control for the png to be rendered.
There are many distributions that are very specific from Species Files (which are of course TDWG-based), but our TW-imported distributions don't reflect this. Although our goal in this task is to not display below the state level, in my opinion this information should be accessible, at least through the OTU.
Since Dmitry has raised the issue that TDWG shapes are effectively obsolete, and we have apparently currently disregarded the TDWG distributions from the maps, do we need a TDWG to GADM translation, either as part of the import or post import?
For user-defined shapes, should there be a similar method to conform shapes to non-TDWG encoding, or just subsume them in an ST_Collect ?
and we have apparently currently disregarded the TDWG distributions from the maps
Untrue. They have shapes and are treated like any other.
Closing as complete. We'll open individual issues to handle ongoing issues.