
History / time series API

Open tkurki opened this issue 8 years ago • 10 comments

tkurki avatar Sep 16 '15 06:09 tkurki

We need an API that provides ways to

  • discover what time series are available => present a list of for example dates / time ranges for which we have data
  • a way to query & summarize the data, for example

Soundings for a date

  • September 15
  • 5 minute resolution
  • position (average), depth (min/max/avg)
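
A rough sketch of how the soundings summary could be computed, assuming samples are available as (timestamp, lat, lon, depth) tuples. The function name `summarize_depth` and the shape of the result rows are made up for illustration; nothing here is an agreed Signal K API.

```python
from collections import defaultdict
from datetime import datetime, timezone


def summarize_depth(samples, bucket_seconds=300):
    """Group (timestamp, lat, lon, depth) samples into fixed time buckets.

    Returns one row per bucket with the average position and the
    min/max/avg depth. Averaging lat/lon arithmetically is a
    simplification that breaks down near the antimeridian.
    """
    buckets = defaultdict(list)
    for ts, lat, lon, depth in samples:
        epoch = int(ts.timestamp())
        # Key each sample by the start of its bucket (default 5 minutes).
        buckets[epoch - epoch % bucket_seconds].append((lat, lon, depth))

    rows = []
    for start in sorted(buckets):
        points = buckets[start]
        n = len(points)
        depths = [p[2] for p in points]
        rows.append({
            "start": datetime.fromtimestamp(start, tz=timezone.utc),
            "position": (sum(p[0] for p in points) / n,
                         sum(p[1] for p in points) / n),
            "depth": {"min": min(depths), "max": max(depths),
                      "avg": sum(depths) / n},
        })
    return rows


# Hypothetical sample data for September 15.
samples = [
    (datetime(2015, 9, 15, 10, 0, 30, tzinfo=timezone.utc), 60.0, 25.0, 4.0),
    (datetime(2015, 9, 15, 10, 3, 0, tzinfo=timezone.utc), 60.2, 25.2, 6.0),
    (datetime(2015, 9, 15, 10, 6, 0, tzinfo=timezone.utc), 60.4, 25.4, 9.0),
]
summary = summarize_depth(samples)
```

The first two samples fall into the 10:00 bucket and the third into the 10:05 bucket, so `summary` has two rows.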

Depth data for an area

  • all data within a geofence
  • position+depth, all data
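
As an illustration only, the simplest possible geofence is a latitude/longitude bounding box. The sketch below assumes each stored sample is a dict with `position` as a (lat, lon) pair and a `depth` value; the names `within_bbox` and `depth_in_area` are hypothetical, and a real geofence could just as well be a polygon.

```python
def within_bbox(sample, south, west, north, east):
    """True if the sample's position lies inside the bounding box.

    Assumes the box does not cross the antimeridian.
    """
    lat, lon = sample["position"]
    return south <= lat <= north and west <= lon <= east


def depth_in_area(samples, south, west, north, east):
    """Return every position+depth sample inside the box, unaggregated."""
    return [s for s in samples if within_bbox(s, south, west, north, east)]


# Hypothetical stored samples.
stored = [
    {"position": (60.10, 24.95), "depth": 5.2},
    {"position": (60.50, 25.80), "depth": 12.0},
    {"position": (60.15, 25.00), "depth": 7.4},
]
inside = depth_in_area(stored, south=60.0, west=24.9, north=60.2, east=25.1)
```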

True wind history

  • last 2 hrs
  • true wind direction and speed, 1 minute average
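
One detail worth keeping in mind for the wind case: speed can be averaged arithmetically, but direction is an angle, so a plain mean of 350° and 10° gives 180° instead of 0°. A minimal sketch of a circular mean, assuming directions are given in radians (the function name is hypothetical):

```python
import math


def mean_direction(angles_rad):
    """Circular mean of angles in radians, normalised to [0, 2*pi).

    Each angle is treated as a unit vector and the vectors are summed.
    The result is not meaningful if the vectors cancel out (for example
    two exactly opposite directions).
    """
    s = sum(math.sin(a) for a in angles_rad)
    c = sum(math.cos(a) for a in angles_rad)
    return math.atan2(s, c) % (2 * math.pi)


# Directions either side of north average to (approximately) north.
north_ish = mean_direction([math.radians(350), math.radians(10)])
```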

Events history

  • annotated events created by user inputs

There are undoubtedly countless others.

Here the schema is less important than the query language. For example the current delta format in an array or something similar would work pretty nicely - a collection of data items with timestamps.

tkurki avatar Sep 16 '15 06:09 tkurki

From slack discussion: Basically we can store a list of {"timestamp":"123456", "path":"y.z.x", "value":"nnn"} in a persistent db. That would provide a place for history outside the signalk model, which is really the 'current' data.

Then I would propose /signalk/v1/history/[context]/[path]?start=xxxxx&end=yyyyyy

So as similar as possible to the REST api, but with its own url. Then we can advertise it if the server supports it. And an equivalent signalk message for other transports:

{"context":"vessels.self","history":{"start":nnnnnnn, "end":yyyyyyyy, "path":"x.y.z"}}

The reply should be either a series of update messages, or a list of the stored objects {"timestamp":"123456", "path":"y.z.x", "value":"nnn"}. The updates seem more verbose, but handy for replays etc.
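
To make the proposal concrete, here is a small sketch that builds both forms of the request. The helper names are hypothetical, and since the proposal does not say whether context and path use dots or slashes in the URL, they are passed through unchanged.

```python
from urllib.parse import urlencode


def history_url(context, path, start, end, base="/signalk/v1/history"):
    """Build the proposed REST URL: <base>/<context>/<path>?start=..&end=.."""
    query = urlencode({"start": start, "end": end})
    return f"{base}/{context}/{path}?{query}"


def history_message(context, path, start, end):
    """Build the equivalent request message for other transports."""
    return {
        "context": context,
        "history": {"start": start, "end": end, "path": path},
    }


url = history_url("vessels.self", "navigation.position", 1442300000, 1442386400)
msg = history_message("vessels.self", "navigation.position", 1442300000, 1442386400)
```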

rob42 avatar Sep 19 '15 22:09 rob42

Your {"timestamp":"123456", "path":"y.z.x", "value":"nnn"} looks an awful lot like a delta message. I think there is value in using one format for both the delta stream and the history query result. Also, a more complex use case like position and speed for the racing fleet, 5 minute average, would fit pretty well, if not optimally, with the result being a list of deltas.

I would not put context and path into the url path. How do you query for multiple values? It implies a hierarchy, which is not there. They are also somewhat orthogonal: context is part of the where clause and path a part of the select clause in sql terms.
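
A sketch of what that separation could look like against a flat list of stored records. It assumes each record also carries its `context` and that timestamps are comparable numbers; the function name and record layout are illustrative only.

```python
def query_history(records, contexts, paths, start, end):
    """Return stored records matching the query.

    contexts and the time range act like the WHERE clause (which vessels,
    which period); paths acts like the SELECT list (which values). With one
    stored row per value, both end up as filters over the same list.
    """
    return [
        r for r in records
        if r["context"] in contexts
        and r["path"] in paths
        and start <= r["timestamp"] <= end
    ]


# Hypothetical stored records for two vessels.
records = [
    {"context": "vessels.a", "timestamp": 100, "path": "navigation.position", "value": (60.0, 25.0)},
    {"context": "vessels.a", "timestamp": 100, "path": "navigation.speedOverGround", "value": 3.1},
    {"context": "vessels.a", "timestamp": 100, "path": "environment.depth.belowKeel", "value": 8.0},
    {"context": "vessels.b", "timestamp": 160, "path": "navigation.speedOverGround", "value": 2.7},
    {"context": "vessels.b", "timestamp": 900, "path": "navigation.speedOverGround", "value": 2.9},
]
fleet = query_history(
    records,
    contexts={"vessels.a", "vessels.b"},
    paths={"navigation.position", "navigation.speedOverGround"},
    start=0,
    end=300,
)
```

Multiple contexts and multiple paths fit naturally here, which is harder to express when both are baked into the URL path.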

tkurki avatar Sep 20 '15 07:09 tkurki

Food for thought: http://eagleio.readthedocs.org/en/latest/reference/historic/jts.html

tkurki avatar Apr 20 '16 15:04 tkurki

https://github.com/james-hu/cjtsd-js/wiki/Compact-JSON-Time-Series-Data

tkurki avatar Apr 20 '16 15:04 tkurki

https://github.com/michaelwittig/fliptable/blob/master/README.md

tkurki avatar Apr 20 '16 15:04 tkurki

@vyacht did you have time to look at the resources linked above? Your thoughts on the history api?

tkurki avatar Apr 21 '16 17:04 tkurki

Sure, they were the first 3 google hits and I looked at them last week. jts looks natural. Compact format seems a bit awkward - I think you want something more human readable. {"timestamps":[],"values":[]} is a bit weird as well, separating things that clearly belong together. Other references are GPX, KML for how to present tracks and data collected along the way.

I have been testing more approaches and have a prototype graph running. I seem to always end up with something close to my PR. Moved time references directly under vessel for fun and using arrays "vessel":[{"navigation":{}},{"navigation":{}}]. Basically generating a track of all data for each boat as an array.

Some thoughts

  1. Switch from historic data to most recent updates - this comes in handy when you want to draw a depth graph, pull the history and then continue with the new data
  2. Pagination
  3. Data often makes more sense in a context of multiple values and not only time (e.g. depth at position).
  4. "Aggregates" (max, min, mean), various periods (historic, last, next, last1h, last24h) and also boxing data (max and min lat covered by data) for e.g. tidal data, speed, wind speed

Points 1 and 2 are covered by the above. Better to use wording like "startAfter" instead of "start" for pagination.
Point 3 could be solved by "vessel":[{"navigation":{}},{"navigation":{}}]
Point 4 remains difficult

Approaches for query APIs are sufficiently well understood these days. REST example:

?startAfter=xyz&fields=navigation.depth,navigation.position
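
For illustration, a server could pick such a query apart with the standard library alone. The function name and the returned dict layout below are hypothetical; only `startAfter` and `fields` come from the example above.

```python
from urllib.parse import parse_qs


def parse_history_query(query_string):
    """Parse '?startAfter=...&fields=a,b' into a small dict.

    startAfter is kept as an opaque string; fields is split on commas.
    """
    params = parse_qs(query_string.lstrip("?"))
    fields = params.get("fields", [""])[0]
    return {
        "start_after": params.get("startAfter", [None])[0],
        "fields": [f for f in fields.split(",") if f],
    }


parsed = parse_history_query(
    "?startAfter=xyz&fields=navigation.depth,navigation.position"
)
```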

vyacht avatar Apr 21 '16 17:04 vyacht

https://docs.influxdata.com/influxdb/v1.0/guides/querying_data/

Pushing SK numeric data to InfluxDB is pretty straightforward and visualisation via http://grafana.org/ is pretty easy.

tkurki avatar Sep 30 '16 18:09 tkurki

The artemis-server stores all signalk data directly in a timeseries db (currently influx) natively, and uses realtime query of the data for all output. See https://github.com/SignalK/artemis-server

The current behaviour uses the latest timestamp for any requested key, which is consistent with other signalk servers' behaviour. However, there is no reason not to request data at any point in the past.

As noted by @vyacht we can send the request easily by REST or signalk subscribe or get message.

This would just require a new parameter (timeAt?) to pass the requested time in the request, and would still be fully compatible with existing signalk output. Since we already include a timestamp in messages, the resulting message would just have a timestamp in the past.

Requesting values across a time span can be done in a similar way with timeFrom and timeTo parameters. I think the delta format will allow these to be returned, since each value has a timestamp.
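
A minimal sketch of that idea: stored rows for a time span are grouped by timestamp and wrapped as one delta with one update per timestamp. It assumes rows look like {"timestamp", "path", "value"} with ISO 8601 timestamps in a uniform format (so they sort as strings); the function name is hypothetical and the delta is reduced to context, updates, timestamp and values.

```python
from itertools import groupby
from operator import itemgetter


def rows_to_delta(context, rows):
    """Wrap stored history rows as a delta-style message.

    Rows sharing a timestamp become the values of a single update.
    """
    ordered = sorted(rows, key=itemgetter("timestamp"))
    updates = []
    for ts, group in groupby(ordered, key=itemgetter("timestamp")):
        updates.append({
            "timestamp": ts,
            "values": [{"path": r["path"], "value": r["value"]} for r in group],
        })
    return {"context": context, "updates": updates}


# Hypothetical rows returned for a timeFrom/timeTo query.
rows = [
    {"timestamp": "2018-07-27T10:01:00Z", "path": "environment.depth.belowKeel", "value": 7.9},
    {"timestamp": "2018-07-27T10:00:00Z", "path": "environment.depth.belowKeel", "value": 8.1},
    {"timestamp": "2018-07-27T10:00:00Z", "path": "navigation.speedOverGround", "value": 3.2},
]
delta = rows_to_delta("vessels.self", rows)
```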

From work with grafana it seems we would also need an interval (or resolution, #363) key, so the data can be summarised to 1 second, 1 minute, etc. We don't want a 10Hz value over 30 days!

That would also require an 'aggregate' key, e.g. avg, mean, max, min etc.

Using existing formats may not be the most compact, but does have the advantage of minimal change to existing apps. By incorporating a requestTime widget most could easily move back and forth to any historic point, and by adjusting the subscribe behaviour to begin in the past, they can also do replays.

That leaves the concept of filters or a where clause, but I think that's a complex issue, and it can be added later.

rob42 avatar Jul 28 '18 00:07 rob42