Model(?) - As a user I want a low-res presence/absence map for OTUs

Open mjy opened this issue 2 years ago • 17 comments

Addresses the need for TaxonPages API call.

  • The map uses a base tile set, likely Natural Earth, down to level-one subdivisions as shapes
  • The map areas have 2 states: on and off
  • Rasters are an option to consider
  • Calculating on the PostGIS back end is an option to consider; we will have to benchmark.

Considerations:

  • Use raster?
  • Store raster, compute on demand?
  • Store raster, cache in background?
  • For large underlying result sets we can limit the work to n queries of LIMIT 1, where n is the number of areas that could be filled.
  • For small underlying result sets we can simply enumerate all intersections?
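
A minimal sketch of the last two considerations, in plain Ruby rather than SQL. It is an illustration under stated assumptions, not TaxonWorks code: shapes are reduced to bounding boxes `[min_x, min_y, max_x, max_y]`, and `boxes_intersect?` and `filled_areas` are hypothetical helpers. `any?` stops at the first hit for each area, which plays the role of one `LIMIT 1` query per area.

```ruby
# Bounding boxes stand in for real geometries: [min_x, min_y, max_x, max_y].
# All names here are hypothetical, for illustration only.
def boxes_intersect?(a, b)
  a[0] <= b[2] && b[0] <= a[2] && a[1] <= b[3] && b[1] <= a[3]
end

# For each candidate area, stop at the first intersecting item
# (the in-memory analogue of one LIMIT 1 query per area).
def filled_areas(areas, items)
  areas.select { |_name, box| items.any? { |item| boxes_intersect?(box, item) } }.keys
end

areas = {
  'A' => [0, 0, 10, 10],
  'B' => [20, 0, 30, 10],
  'C' => [40, 0, 50, 10]
}
items = [[1, 1, 2, 2], [3, 3, 4, 4], [25, 5, 26, 6]]

p filled_areas(areas, items) # => ["A", "B"]
```

With many items and few areas the early exit does most of the work; with few items, looping over every pair costs about the same and is simpler.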

mjy avatar Jun 01 '22 19:06 mjy

Let the backend only return the natural earth shape NAMES and have the frontend pick the polygons up from an assets server? Could probably be vector data like a modern equivalent of jvectormap so we don't need a tile server for multiple zoom levels.

LocoDelAssembly avatar Jun 07 '22 18:06 LocoDelAssembly

Let the backend only return the natural earth shape NAMES

I think this is definitely worth pursuing. We need a string summary anyways, for human readable consumption at some level, so this approach would have twice the value.

mjy avatar Jun 07 '22 18:06 mjy

https://observablehq.com/@d3/world-choropleth https://github.com/topojson/world-atlas https://d3-graph-gallery.com/graph/choropleth_basic.html

I'm gathering that it might be hard to match on NAME with confidence down to the state/province level, though.

https://www.mapchart.net/world.html. See option to turn-on boundaries.

I wonder if hex-bin approach might be better for us.

mjy avatar Jun 07 '22 18:06 mjy

A lot of these shapes are VERY detailed and the WKT would be very voluminous. A list of names or codes would definitely be human friendly. WKB is denser, but maybe practical for client side compositing.

jrflood avatar Jun 07 '22 18:06 jrflood

I would think both names/codes and WKB together would be better than sending the former and then the client requesting the latter. The JSON can pair these nicely and the request could optionally specify which data is needed.
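
A rough sketch of what such a paired payload could look like, assuming a hypothetical `presence_payload` builder and hypothetical key names (`name`, `code`, `wkb_hex`). Names and codes are always returned; the geometry is attached only when the request asks for it. The WKB value below is a placeholder (the hex encoding of `POINT(1 2)`), not a real state outline.

```ruby
require 'json'

# Hypothetical payload builder: names/codes are always present, and
# hex-encoded WKB is included only when the caller requests geometry.
def presence_payload(areas, include_geometry: false)
  rows = areas.map do |area|
    row = { 'name' => area[:name], 'code' => area[:code] }
    row['wkb_hex'] = area[:wkb_hex] if include_geometry
    row
  end
  JSON.generate('areas' => rows)
end

areas = [
  # Placeholder geometry: WKB hex for POINT(1 2).
  { name: 'Illinois', code: 'US-IL', wkb_hex: '0101000000000000000000F03F0000000000000040' }
]

puts presence_payload(areas)
puts presence_payload(areas, include_geometry: true)
```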

jrflood avatar Jun 07 '22 18:06 jrflood

A lot of these shapes are VERY detailed.

The NE shapes are not, they are optimized for exactly this kind of thing. This is very much a simplify and send exercise. Nobody is proposing to send original data.

There are algorithmic considerations for building the list to send.

If we have n data points (geospatial shapes) and m possible map target shapes, then when n >> m or m >> n we can optimize the obvious intersection checks, etc.

My gut feeling is that we should, at minimum, start by identifying the queries, per OTU, that result in native SQL lists of GeographicItem shapes. From that we can explore ST_Simplify, merging, etc. in native SQL. For large n only when we have a reduced geometry should we then intersect it with m. For small n we might as well just loop and calculate the intersections to get the list.

Since we already persist the NE shapes we need on our backend we should be able to build some very efficient queries to calculate the shape(s) to send back. It could just be one merged polygon run through with simplify.
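
As a sketch of the "one merged polygon run through with simplify" idea, the Ruby helper below only builds the SQL text; it does not run it. `ST_Union`, `ST_Simplify` and `ST_AsGeoJSON` are standard PostGIS functions, but the `geom` column name, the `presence_union_sql` helper and the default tolerance are assumptions made for illustration, and the real schema may differ.

```ruby
# Builds (but does not execute) a PostGIS query that merges and simplifies shapes.
# `geographic_items.geom` is a hypothetical column name; the tolerance is in the
# units of the geometry's coordinate system (degrees for lon/lat data).
def presence_union_sql(geographic_item_ids, tolerance: 0.1)
  # Integer() rejects anything that is not a whole number, so ids are safe to inline.
  ids = geographic_item_ids.map { |id| Integer(id) }.uniq
  raise ArgumentError, 'at least one id is required' if ids.empty?

  <<~SQL
    SELECT ST_AsGeoJSON(ST_Simplify(ST_Union(gi.geom), #{Float(tolerance)})) AS presence
    FROM geographic_items gi
    WHERE gi.id IN (#{ids.join(', ')});
  SQL
end

puts presence_union_sql([11, 42, 42])
```

The merged, simplified geometry could then be intersected with the m target shapes, rather than intersecting each of the n source shapes individually.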

mjy avatar Jun 07 '22 19:06 mjy

What is the hierarchical scope above genus? I'm apparently missing an implicit important attribute of this function. If the limit is, for example, order, then the map would probably be the whole world.

jrflood avatar Jun 07 '22 19:06 jrflood

There is no metadata in the map, it is on/off. All data are gathered for the current OTU and below.

In practice this is simply a join like otus: {id: Otu.coordinate_otus(@otu)} IIRC.

mjy avatar Jun 07 '22 19:06 mjy

Aside from shape files, geographic items also theoretically can be various other types, from point to geometry collections. In my local database only points and multi polygons exist, although there is discrepancy in the counts. I am assuming other types should be depicted.

jrflood avatar Jul 22 '22 16:07 jrflood

We need to render any of the types that we support.

mjy avatar Jul 22 '22 16:07 mjy

Snapshot observations on functional modularity, post experimentations: In the mid- to low-level function, we want to fetch (unique) geographic items for a list of OTUs, which could be a discrete functional level. Additional levels of specialization would be e.g., specific criteria for OTU selection, such as all descendants below one OTU, or amalgamation of unique geographic items into the minimal set of item types (i.e., combining sets of overlapping polygons into single perimeter polygons). The aforementioned first two functions are partially extant (limited) in the api call .../otus/{id}/inventory/distribution, which emits geoJSON.

In the issue #3010 title the implied undesirable characteristics of .../inventory/distribution are polygon repetition (and concomitant intensification). A third party renderer shows a longitudinal offset to the base map as well. This offset anomaly is not observed in the geojson.io site, so probably not intrinsically our data issue.

jrflood avatar Jul 27 '22 20:07 jrflood

Screen Shot 2022-07-29 at 10 10 08 AM

jrflood avatar Jul 29 '22 15:07 jrflood

The above exhibits undesirable (?) intensification from sub-areas, and also has several duplicate georeferences (which produce no visual artifacts). Starting to experiment with simplify for overlapping multi polygons in a geometry collection.

jrflood avatar Jul 29 '22 15:07 jrflood

Above used the API for genus Rhipipteryx, copying geoJSON from result to geojson.io

jrflood avatar Jul 29 '22 16:07 jrflood

If the overarching intent is to render only non-overlapped geographic areas, it should be possible to eliminate some items from the query results by using the area hierarchy. If the containing area (ancestor_id present) of this area is present then omit it. This would filter on the array of geographic area/item IDs. This does not cover the case where areas are adjacent, but it's not clear that areas reduced in this way will present any performance problems. Also there may be some user benefit to rendering the boundaries, especially if we decide to articulate the areas with hover/tooltips.
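
A minimal sketch of that filter, assuming the hierarchy is available as a plain hash of area id to parent id (`parent_of`, a hypothetical stand-in for the area hierarchy). An area is dropped when any of its ancestors is already in the result set.

```ruby
require 'set'

# parent_of maps area id => parent id (nil at the root). Hypothetical data shape.
# Keeps only areas that have no ancestor among the selected ids.
def drop_contained_areas(area_ids, parent_of)
  present = area_ids.to_set
  area_ids.uniq.reject do |id|
    ancestor = parent_of[id]
    # Walk up until we run out of ancestors or find one that is selected.
    ancestor = parent_of[ancestor] while ancestor && !present.include?(ancestor)
    !ancestor.nil?
  end
end

parent_of = { 1 => nil, 2 => 1, 3 => 2, 4 => 1, 5 => nil, 6 => 5 }

p drop_contained_areas([2, 3, 6, 3], parent_of) # => [2, 6]
```

Area 3 is dropped because its parent 2 is selected; 2 and 6 stay because none of their ancestors are in the list.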

jrflood avatar Aug 04 '22 18:08 jrflood

From your comments today, it sounds like you want to add an additional constraint to not depict points per se, but to instead convert them to their "most common denominator" Natural Earth area?

jrflood avatar Aug 05 '22 16:08 jrflood

And do not depict county-level geographic areas. Bring everything up to the state level. I would suggest shading both state and country areas.
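
One way to picture the roll-up, as a sketch only: the ids, the `:level` key and the helper names below are hypothetical and do not reflect the actual geographic area model.

```ruby
# Hypothetical slice of an area hierarchy; ids and the :level key are illustrative.
AREAS = {
  1 => { name: 'United States', level: :country, parent_id: nil },
  2 => { name: 'Illinois', level: :state, parent_id: 1 },
  3 => { name: 'Champaign County', level: :county, parent_id: 2 }
}.freeze

# Walk up the parent chain until the area is at state level or coarser.
def roll_up_to_state(area_id, areas)
  area_id = areas.fetch(area_id)[:parent_id] while areas.fetch(area_id)[:level] == :county
  area_id
end

# Roll every area up, then drop the duplicates that the roll-up creates.
def areas_to_shade(area_ids, areas)
  area_ids.map { |id| roll_up_to_state(id, areas) }.uniq
end

p areas_to_shade([3, 2, 1], AREAS) # => [2, 1]
```

State and country areas pass through unchanged, so both can still be shaded.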

proceps avatar Aug 05 '22 16:08 proceps

I want to use ST_Union on the equivalent result of /api/v1/otus/{otu_id}/inventory/distribution (currently mocked as .../inventory/presence in branch 3010_presence_map) as a PostGIS GeometryCollection. Since the result in /distribution is constructed as geoJSON, should I translate the geographic object constructions to PostGIS syntax, or should I use the ST_GeomFromGeoJSON function on the resultant geoJSON? While the execution time is questionable, this should produce the minimal geometry. Unfortunately, ST_GeomFromGeoJSON seems to only accept simple geometry types.

jrflood avatar Nov 30 '22 04:11 jrflood

Sounds like you're at the point where you want to try more data?

I think using the resultant geoJSON would be inefficient as we'd have to go to and from the database. I think we want to get the native data "behind" the geoJSON and work with that before it's geoJSON?

native -> compile -> geoJSON -> native -> fn() <- native 2x is a big red flag

mjy avatar Nov 30 '22 14:11 mjy

SELECT ST_AsGeoJSON(ST_Union(ST_GeomFromGeoJSON('{"type":"Point","coordinates":[-48.23456,20.12345]}'), ST_GeomFromGeoJSON('{"type": "Point","coordinates":[-80.2499,-1.4165,0]}'))) As wkt;

succeeds with:

{"type":"MultiPoint","coordinates":[[-48.23456,20.12345],[-80.2499,-1.4165]]}

I originally swung for the fences by pasting the API output from .../api/v1/otus/{id}/inventory/distribution

Of course this is exactly the case you red flagged above, but I'm (re?)gaining insight into the nuances among postGIS WKT, geoJSON, and our geo_X tables.

jrflood avatar Nov 30 '22 21:11 jrflood

A migration will be needed to handle the following requirements:

CREATE EXTENSION postgis_raster;
SET postgis.gdal_enabled_drivers = 'ENABLE_ALL';

Note that the latter works, but might be overkill.

mjy avatar Nov 30 '22 22:11 mjy

Following the above, this enables the drivers on the server side:

ALTER SYSTEM SET postgis.gdal_enabled_drivers TO 'ENABLE_ALL';
SELECT pg_reload_conf();

Does require further exploration.

mjy avatar Nov 30 '22 22:11 mjy

Overview/review of presence map approach

Consolidate asserted distributions, collection object and type material geographic information for an OTU and its coordinate or subordinate OTUs into a monochrome shaded map. Existing maps are potentially comprised of a huge number of shapes and may have undesirable shading density artifacts.

Use a method similar to the otus_helper/otu_distribution to determine areas to be displayed.

A method for associating a geographic area/item with point data is necessary since these would otherwise tend to dominate maps. A method for finding the smallest (lowest level) area containing the point should be developed.
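
A minimal sketch of the "smallest containing area" lookup, under loud assumptions: areas are approximated by bounding boxes, and `BoxArea` and `smallest_containing_area` are hypothetical names. With real polygons in PostGIS the same idea would presumably be a containment test ordered by area.

```ruby
# Areas are approximated by bounding boxes; real data would use polygons.
BoxArea = Struct.new(:name, :min_x, :min_y, :max_x, :max_y) do
  def contains?(x, y)
    x.between?(min_x, max_x) && y.between?(min_y, max_y)
  end

  def extent
    (max_x - min_x) * (max_y - min_y)
  end
end

# Among all areas containing the point, pick the one with the smallest extent.
# Returns nil when no area contains the point.
def smallest_containing_area(areas, x, y)
  areas.select { |a| a.contains?(x, y) }.min_by(&:extent)
end

areas = [
  BoxArea.new('country', 0, 0, 100, 100),
  BoxArea.new('state', 10, 10, 40, 40),
  BoxArea.new('county', 20, 20, 25, 25)
]

p smallest_containing_area(areas, 30, 30)&.name # => "state"
p smallest_containing_area(areas, 22, 22)&.name # => "county"
```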

Geographic areas should be selected distinct, and their intrinsic hierarchy should then be used to remove any subordinate areas. There is no reason to multiply process shapes, since we want a monochrome shaded area.
Since we iterate OTUs in separate loop sections, we might want to use a temporary table constructor or an array to collect area IDs, then select the geographic_items from the distinct geographic_area IDs.

Is there a convenient way to uniquify a Ruby array?
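
For reference, core Ruby covers this: `Array#uniq` returns a new array without duplicates (keeping the first occurrence of each value), and `|=` accumulates a union across separate loop sections. The ids below are made up for illustration.

```ruby
area_ids = [12, 7, 12, 3, 7]
p area_ids.uniq # => [12, 7, 3]

# Accumulating ids from separate loop sections without duplicates.
collected = []
collected |= [12, 7]
collected |= [7, 3]
p collected # => [12, 7, 3]
```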

The geographic items for these areas, reduced to the minimum set, could either be rendered as geoJSON polygons (similar to the currently collected areas), or collected using ST_Collect and then either sent as geoJSON or rasterized.

Rasterizing seems to require a compatible client control for the png to be rendered.

jrflood avatar Dec 01 '22 21:12 jrflood

There are many distributions that are very specific from Species Files (which are of course TDWG-based), but our TW-imported distributions don't reflect this. Although our goal in this task is to not display below the state level, it would be my opinion that this information should be accessible, at least through the OTU.

Since Dmitry has raised the issue that TDWG shapes are effectively obsolete, and we have apparently currently disregarded the TDWG distributions from the maps, do we need a TDWG to GADM translation, either as part of the import or post import?

For user-defined shapes, should there be a similar method to conform shapes to non-TDWG encoding, or just subsume them in an ST_Collect ?

jrflood avatar Dec 16 '22 17:12 jrflood

and we have apparently currently disregarded the TDWG distributions from the maps

Untrue. They have shapes and are treated like any other.

mjy avatar Dec 16 '22 17:12 mjy

Closing as complete. We'll open individual issues to handle ongoing issues.

mjy avatar Sep 05 '23 18:09 mjy