oshdb
request metadata output
opened by @tyrasd
we need some method to access some metadata like the following:
- [x] "freshness" of the imported osm data (and/or minutely update status, see #6)
- [x] source of the data
- [x] license(s) of the returned data (i.e. ODbL and/or cc-by-sa in the case of OSM depending on requested time interval)
- [x] spatial validity/completeness of the data in the oshdb instance (global or extent of regional extract)
- [x] temporal validity/completeness of the returned data (e.g. that OSM's history data before October 2008 is incomplete!)
- [x] general stats (amount of data)
- [x] `maxzoom` parameter
- [ ] used grid index
- [ ] oshdb version
- [x] keytables-"hash" for #105
- [ ] instead of a String, the getter method should return an `OSHDBMetadata` object to simplify usage. That way a user directly knows which fields are available and what their format is.
- …
write getters for these properties in OSHDB class
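A typed metadata object along the lines requested above could look roughly like the following. This is only a sketch: the class name `OSHDBMetadataSketch` is hypothetical and not part of OSHDB, the keys `extract.region` and `extract.timerange` are taken from this thread, and the key `maxzoom` is an assumption.

```java
import java.util.Map;
import java.util.Optional;
import java.util.OptionalInt;

/** Hypothetical typed wrapper around raw key/value metadata; not part of the OSHDB API. */
final class OSHDBMetadataSketch {
  private final Map<String, String> raw;

  OSHDBMetadataSketch(Map<String, String> raw) {
    this.raw = Map.copyOf(raw); // defensive, immutable copy
  }

  /** Region used to cut the extract from the planet file (key name taken from this thread). */
  Optional<String> extractRegion() {
    return Optional.ofNullable(raw.get("extract.region"));
  }

  /** Time range covered by the extract (key name taken from this thread). */
  Optional<String> extractTimerange() {
    return Optional.ofNullable(raw.get("extract.timerange"));
  }

  /** Maximum zoom level of the grid; the key "maxzoom" is an assumption. */
  OptionalInt maxZoom() {
    String value = raw.get("maxzoom");
    return value == null
        ? OptionalInt.empty()
        : OptionalInt.of(Integer.parseInt(value.trim()));
  }
}
```

With getters like these, a client sees the available fields and their types from the class itself, and a missing field shows up as an empty `Optional` rather than a `null` string.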
todo:
- [ ] document existing metadata fields
see https://gitlab.gistools.geog.uni-heidelberg.de/giscience/big-data/ohsome/oshdb/issues/64
Two other variables might vary over time and should not result in a change of code on the client side:
- [ ] oshdb-table-prefix
- [ ] oshdb-keytable-tablename
- [ ] if I see it correctly, there is duplicate information in the metadata: `header.bbox` vs. `data.bbox` vs. `extract.region`, and `data.timerange` vs. `extract.timerange`, for example.
There should be no difference because, e.g., even if data.timerange differs from extract.timerange, it effectively does not. That is, even if there were no edits in that region for one day before the extract, the data still represents the current state of the data at the moment of the extract.
data.bbox is the bbox calculated over all coordinates, whereas extract.region does not need to be a bbox; it is the region that was used to cut the extract from the planet file.
But you are right: for queries you should only consider the extract region/timerange.
- [ ] timestamps should be split into two rows, one for start and one for end, for better handling
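To illustrate why separate start and end values are easier to handle, here is a minimal sketch of what a client has to do while both are stored in one row. It assumes the combined value is two ISO-8601 timestamps separated by a comma; that format and the class name `TimerangeSketch` are assumptions, not something confirmed in this thread.

```java
import java.time.Instant;

/** Hypothetical helper: splits a combined "start,end" time range into two instants. */
final class TimerangeSketch {
  final Instant start;
  final Instant end;

  private TimerangeSketch(Instant start, Instant end) {
    this.start = start;
    this.end = end;
  }

  /** Parses e.g. "2008-10-01T00:00:00Z,2020-01-01T00:00:00Z" (assumed format). */
  static TimerangeSketch parse(String combined) {
    String[] parts = combined.split(",", -1);
    if (parts.length != 2) {
      throw new IllegalArgumentException("expected \"start,end\" but got: " + combined);
    }
    Instant start = Instant.parse(parts[0].trim());
    Instant end = Instant.parse(parts[1].trim());
    if (end.isBefore(start)) {
      throw new IllegalArgumentException("end is before start: " + combined);
    }
    return new TimerangeSketch(start, end);
  }
}
```

If start and end were stored as two rows, the splitting and the malformed-value check would no longer be needed on the client side.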
What is the priority of this issue? The prefix and keytables config necessary to start Ignite jobs requires frequent updates of otherwise stable config files, especially now with the weekly updates.
In our Ignite cluster we have a special cache called "ohsome". If you want to retrieve the current/newest prefix, you can do it like this:
```java
// getConnection and format are presumably static imports of
// java.sql.DriverManager.getConnection and java.lang.String.format.
try (var ignite = Ignition.start("ohsome-heigit.xml")) {
  // the "ohsome" cache holds the currently active prefix under the key "active"
  var ohsome = ignite.<String, String>cache("ohsome");
  var prefix = ohsome.get("active");
  var APPLICATION_NAME = "MyApplicationName";
  var ktUrlTmpl = "JDBC_PROTOCOL://HOST_URL/keytables-%s?ApplicationName=%s";
  try (var keytables = getConnection(
      format(ktUrlTmpl, prefix, APPLICATION_NAME), "USERNAME", "PASSWORD")) {
    var oshdb = new OSHDBIgnite(ignite);
    oshdb.prefix(prefix);
    OSMEntitySnapshotView.on(oshdb)
        .keytables(new OSHDBJdbc(keytables))
        ...
  }
}
```
OK, that sounds feasible, but it seems incompatible with https://gitlab.gistools.geog.uni-heidelberg.de/giscience/big-data/ohsome/helpers/oshdb-database-driver. Wasn't the idea to have exactly that logic in a dedicated class or method for each backend:
write getters for these properties in OSHDB class
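One way to picture such a dedicated per-backend class is sketched below. Everything here is hypothetical: neither `OshdbConfigSource` nor `CacheBackedConfigSource` exists in OSHDB, and a plain `Map` stands in for the Ignite "ohsome" cache so that the example runs without a cluster.

```java
import java.util.Map;

/** Hypothetical per-backend hook; neither the interface nor its methods exist in OSHDB. */
interface OshdbConfigSource {
  /** Currently active table/cache prefix. */
  String prefix();

  /** Builds the keytables JDBC URL for the active prefix from a two-placeholder template. */
  default String keytablesUrl(String urlTemplate, String applicationName) {
    return String.format(urlTemplate, prefix(), applicationName);
  }
}

/** Stand-in for the Ignite "ohsome" cache: any map with an "active" entry. */
final class CacheBackedConfigSource implements OshdbConfigSource {
  private final Map<String, String> cache;

  CacheBackedConfigSource(Map<String, String> cache) {
    this.cache = cache;
  }

  @Override
  public String prefix() {
    String prefix = cache.get("active");
    if (prefix == null) {
      throw new IllegalStateException("no \"active\" prefix in cache");
    }
    return prefix;
  }
}
```

Client code would then ask the config source for the prefix and keytables URL instead of hard-coding them, so a weekly prefix change would not require touching client config files.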
https://github.com/GIScience/oshdb/tree/metadataRework