openhab-core icon indicating copy to clipboard operation
openhab-core copied to clipboard

Enable binding to store historic states

Open altaroca opened this issue 3 years ago • 29 comments

please refer to #844 this adds a method BaseThingHandler.updateHistoricState (ChannelUID channelUID, State state, ZonedDateTime dateTime) and all logic to route the corresponding ItemHistoricStateEvent through the core to the persistence layer and any ModifiablePersistenceService. I have tested it using my fork of the E3DC binding and influxdb modified to implement ModifiablePersistenceService. I made some choices for simplicity sake:

  • the item current state and its historic state only meet in the persistence. Thus the historic state (even if younger) does not affect the current state.
  • the events are separate but share the same publication topic

altaroca avatar Jun 14 '22 16:06 altaroca

wow, that was fast. I must apologize. I won’t be able to react that quickly. Please give me a few days.

altaroca avatar Jun 14 '22 19:06 altaroca

@J-N-K i agree with your proposals and have made changes accordingly.

What happens if the state has a date this is in the future?

that date is treated as any other. In other words, the binding is responsible for providing a correct date. There may be instances where a future date is needed. Maybe some kind of forecast? I cannot think of anything useful now.

altaroca avatar Jun 20 '22 17:06 altaroca

While I like a lot that this topic come to concrete form I suppose that a bit better way would be proper tagging of states instead of multiplication of events. As outlined - bindings can provide estimations which are ahead of time, hence it would require another kind of event.

I think that making a type TimeBoudState[DateTime, State] would better reflect that case because:

  • This could be still a valid state
  • It would allow to get whole thing working with channel-item profiles
  • It can be detected by persistence adapters

Thing is, if core is ready for such step?

splatch avatar Jun 21 '22 22:06 splatch

@splatch I do not fully understand your comment. What do you mean by 'can still be a valid state'? The PR provides the entity you propose. Only difference is it is called HistoricState.

altaroca avatar Jun 22 '22 14:06 altaroca

The PR provides the entity you propose. Only difference is it is called HistoricState.

@altaroca roger, I missed type itself and got lost by changes in other areas. If you rely on the state then main question I could raise is why distinguish situation by introducing ThingCallback.historicStateUpdated(ChannelUID channelUID, State state, ZonedDateTime dateTime) which hide access to this state in bindings. If bindings have access to String/DateTime and other representations, then this part of change, strictly speaking is not required.

I see a point in making such state a first level citizen because some of bindings could provide a timestamp of received update. To give more concrete example - a CAN interface controller can attach timestamp to the message which delivers state update. Binding then knows exact timestamp of an update and it is not a historic state per say as it is instant value with receive time information.

splatch avatar Jun 22 '22 14:06 splatch

@splatch I had thought about that and decided against it. There were two reasons:

  1. sub-classes of State are often checked by using expressions like instanceof DecimalType. This breaks if there was another type hierarchy based on HistoricState like DecimalHistoricType etc. This might be solved but requires a lot of refactoring in the core and in add-ons.
  2. I fear the complexity of deciding which state is the current one. Will a younger HistoricState overwrite the current state? Yes, for timed measurements. No, for forecasts in the same channel. I do not see a clear rule. That's why I kept State and HistoricState separate.

altaroca avatar Jun 23 '22 15:06 altaroca

  1. Why you consider checking and casting for DecimalState as a fine operation, but checking for TimeBoundState as something wrong? It is a state just like others. In case if primitive type is required you can rely on composition such as below
interface WrapperState extends State /*, ComplexType? */ {
  State unwrap();
  // 24.08.2022 - navigate to specific type of state under wrapper
  <T extends State> T unwrap(Class<T> type);
  <T extends State> boolean has(Class<T> type);
}
class TimeBoundState implements WrapperState {
  private DateTime dateTime;
  private State state;
  State unwrap() {
    State myState = state;
    while (myState instanceof WrapperState) {
       // it can be also moved to State/Type itself so DecimalType will always return this, but wrapper states will unpack themselves
       myState = myState instanceof WrapperState ? ((WrapperState) myState).unpack() : myState;
    }
    return myState;
  }
  // ..
}

// for callers doing persistence and other parties which need to process concrete state instance
State actualState = state instanceof WrapperState ? ((WrapperState) state).unpack() : state;
if (actualState instanceof DecimalType) {
   ...
}
// 24.08.2022 - unpacking to a specific type
DecimalType actualState = state instanceof WrapperState ? ((WrapperState) state).unpack(DecimalType.class) : state.as(DecimalType.class);
if (state instanceof DecimalType || (state instanceof WrapperState && ((WrapperState) state).has(DecimalType.class)) {
   DecimalType dec = state instanceof DecimalType ? (DecimalType) state : ((WrapperState) state).unpack(DecimalType.class);
}
  1. If you keep going with a HistoricState then its a matter of question when we will need to start dealing with FutureState, ScheduledState and related framework changes. Obviously these situations may not appear immediately, but what you signal with historic state is a paired information state+timestamp. If you look at it in such way then it doesn't matter if its past, future or almost now. Processing of time bound states might be completely omitted by items, or based on time accuracy criteria. Just like you do it right now via applyHistoricState. Its based on separate event but may be based on state kind iself.

splatch avatar Jun 23 '22 15:06 splatch

  1. Why you consider checking and casting for DecimalState as a fine operation, but checking for TimeBoundState as something wrong?

Err, no. That's not my point. I meant to say that no existing code will understand the meaning of types like DecimalTimeBoundState because of the way everyone uses instanceof on known sub-classes of State. And we would have to refactor classes like InfluxDBStateConvertUtils.

  1. If you keep going with a HistoricState then its a matter of question when we will need to start dealing with FutureState, ScheduledState and related framework changes. Obviously these situations may not appear immediately, but what you signal with historic state is a paired information state+timestamp. If you look at it in such way then it doesn't matter if its past, future or almost now.

Of course. I completely agree. And I do not care about the name. It's up for discussion what's the best name: HistoricState, TimeBoundState or something else.

What I mean is that people, seeing State and HistoricState being passed along as equal in the same channel will start to wonder which one takes precedence if any. Or do you suggest to have entirely different channels for State and HistoricState? My current implementation uses the same channel and you could interpret HistoricState as a kind of sub-channel on the primary channel.

altaroca avatar Jun 23 '22 16:06 altaroca

IMO if timestamps in the future are allowed, something like TimeboundState would be better than HistoricState as that implies that the time is in the past.

I wonder if it would not be better to extend State with a DateTime getDateTime().(I personally would prefer a Long getTimeStamp(), but that's debatable). A default implementation could return the current time (which is similar to what we have now. If the consumer does not support timed states, but the state does have time information, the time information would be discarded.

This could also replace the need for the HistoricState in the persistence implementations.

J-N-K avatar Jun 23 '22 16:06 J-N-K

@wborn Since you created the original issue: I would like your opinion here. WDYT about adding a timestamp to State? This would require no changes at all for code that is currently not supporting timed values but allow those that implement it to use the date information.

J-N-K avatar Jul 06 '22 18:07 J-N-K

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/panasonic-comfort-cloud-binding/133848/26

openhab-bot avatar Jul 22 '22 16:07 openhab-bot

@openhab/core-maintainers I would really like to hear your opinion here.

J-N-K avatar Aug 20 '22 13:08 J-N-K

Thanks for working on this @altaroca! :+1:

I would really like to hear your opinion here.

I don't have a strong opinion but it seems that in previous related PRs the timestamp always has been kept separate from the State as well.

See for instance this code:

wborn avatar Aug 20 '22 17:08 wborn

Yes, but I don't understand the reasoning behind that. From what @kaikreuzer said in #3033 a possible argument is that this timestamp might not be the actual timestamp of the value change. While this might be true, if the timestamp is @Nullable, a binding can chose not to set it, if it is not known. But having a timestamp together with a state will remove the need for extra-code to handle future values (e.g. forecasts) and historic values (retrieved from persistence service) and will also allow bindings that CAN determine the exact point in time of a value change to communicate that. And it is fully backward compatible because all add-ons agnostic of the timestamp will not be affected (they produce states without timestamps, just like before and they consume states ignoring the timestamp, just like before), but persistence services can properly store the timestamp together with the value. If no timestamp is present, they use the system time, like it is now.

J-N-K avatar Aug 20 '22 17:08 J-N-K

i cannot foresee all downstream implications of this change. Let's approach it semantically: state means (abbreviated from Merriam Webster)

  1. mode or condition of being,
  2. a condition or stage in the physical being of something

This definition does not refer to a timestamp. So if you wanted to always combine a State object with a time you should at least call it differently, such as StateWithTime or the above TimeboundState.

Now the question is: can openhab do without a State object and can you replace it in all places with a TimeboundState object?

altaroca avatar Aug 21 '22 07:08 altaroca

Adding forecasts on the same channels probably will also have its own set of issues. You probably don't want to persist forcasted state and actual state using the same item. I can imagine there are also devices that store their forecasts made at a certain time in a buffer. So if you then read these forecasts at a later time you might want to persist "historic forecasts". :wink:

wborn avatar Aug 21 '22 07:08 wborn

It's questionable if someone wants to persist forecasts at all (maybe to use restoreOnStartup). Anyhow, if the timestamp is persisted together with the value, I don't see any issues arising from that.

My concern is that we now have State, add HistoricState for persisted states (or states that shall be persisted) here and then add FutureState for forecasts later. Then we have three non-interchangable types for representing a state.

Keeping aside that this leads to a very bulky API (and therefore a high burden for developers), I doubt that users will benefit from that. We now separate Command and State and even that is not understood by many. Do we really expect a majority to understand why a State is not a State depending on the presence of a time information? And even more why there are two different types of States with time-information? Or would the user prefer to have one single type to deal with?

I doubt that this HistoricState can be kept internal. As soon as such functionality is added someone will request to make it available e.g. in rules.

J-N-K avatar Aug 21 '22 08:08 J-N-K

I do see usecases for persisting forecasts - electricity prices and weather are two. If my network goes down and openHAB is restarted, I would loose the ability to make decisions on these data without persistence.

IMHO a State should have a timestamp reflecting the actual time this value was sampled/produced - by the sensor/service or by the framework if the underlying sensor/service lacks this.

The State timestamp is typically different from the Event timestamp. which is the timestamp the Event itself was produced. I think we need both - if only not to mix the two concepts and create confusion.

Could we explore the feasability to create new Event types and topics for historic and forecast timeseries? Ie one (set) for bindings that produce these time series and that the framework listens for (and persists the included States with timestamp), and another set of events for rule engines to listen for when new timeseries are available? I'm not very familiar with the inner workings of openHAB here, so bear with me if it breaks backwards compat, creates a lot of work or is simply a bad idea.

seime avatar Aug 22 '22 18:08 seime

I too think that this adds way too much complexity without providing enough benefit.

Persisting forecasts/historic states is something that can easily be done in rules/through the REST API (see #12663). If a binding needs this information why not just let the user pass it in through a channel (e.g. in a rule or through an item). An additional benefit would be that the binding logic does not depend on the persistence service and the user can define a custom fallback e.g. if the service is not available or some other error occurs.

Timestamps have been (partially) discussed in #3033 and imho the only place where it might make sense is for a new ItemStateUpdatedEvent. There it's clear that it's the timestamp of the item state update.

spacemanspiff2007 avatar Aug 23 '22 04:08 spacemanspiff2007

Persisting forecasts/historic states is something that can easily be done in rules/through the REST AP

I don't think a binding should be forced to use the REST API - IMHO the REST API is for external use and forcing "internal" users to use this feels like a kludge and (IMHO) makes a binding a second class citizen.

If a binding needs this information why not just let the user pass it in through a channel (e.g. in a rule or through an item).

If you mean to provide a separate channel for the time information, then IMHO this is not a good idea. You loose the consistency - you now have to find some way to correlate the state and time into a single object and that's not easy. OH doesn't guarantee that events will be kept in order, so a state1/time1/state2/time2 stream might be received as state1/state2/time2/time1. This also stands true for other information where states are split into multiple channels (eg a dimmer value, and the rate of change) - these really should all be considered as "attributes" of a state rather than as separate channels. I know this was discussed and rejected previously though...

An additional benefit would be that the binding logic does not depend on the persistence service

I've not looked at all the comments etc, but I don't understand why adding a timestamp to a state requires interaction between the binding and the persistence. I think @J-N-K stated earlier that this should be backward compatible - if a binding adds a time to a state, then the persistence can persist the time - if not, it falls back to the existing implementation.

There are quite a few devices that can provide historical data - eg both Zigbee and ZWave meters can store information in the event that the controller is offline, and provide this historical data later. Currently it's not possible (without some sort of kludge) for the binding to incorporate this data back into the system so I would certainly support something that can do this.

cdjackson avatar Aug 23 '22 22:08 cdjackson

Adding forecasts on the same channels probably will also have its own set of issues

FYI forecasts and states could be easily filtered out through appropriate profiles. You can use a single channel as source of data and fence items with proper profile to keep forecast or actual state.

It's questionable if someone wants to persist forecasts at all (maybe to use restoreOnStartup). Anyhow, if the timestamp is persisted together with the value, I don't see any issues arising from that.

Well... a consumption of energy from beginning of month till now can be stored at the end of current month. It is stored ahead of time at the very last point of a time bucket.

Do we really expect a majority to understand why a State is not a State depending on the presence of a time information? And even more why there are two different types of States with time-information? Or would the user prefer to have one single type to deal with?

Developers and end users so far lived without a time next to state, so they will survive without it a little bit more. They will survive introduction of other kind of state if they will benefit from it. Otherwise they will complain. My personal rationale is that we can't keep state types limited to existing ones in wider scenarios. We either think ahead and design it or keep fooling ourselves that we can cover all new born scenarios with tight list of states existing since a decade (ie. by use of strings with json payloads) or individual fields and processors added by case basis here and there. Yet it will pollute and complicate core even more.

splatch avatar Aug 23 '22 23:08 splatch

Persisting forecasts/historic states is something that can easily be done in rules/through the REST AP

I don't think a binding should be forced to use the REST API - IMHO the REST API is for external use and forcing "internal" users to use this feels like a kludge and (IMHO) makes a binding a second class citizen.

My argument was not about the binding having to use the rest api. It was "If the user wants to persist future/historic states he shall do so in rules OR the rest api".

If bindings write to persistence a whole new set of issues arise:

  • what if I don't use persistence at all, can I use the binding?
  • what if my persistence doesn't provide the features the binding uses
  • what if I want to use a different persistence for historic/future states
  • Assuming that the same item is used: if things fail and I look a couple of days later at persisted values, are they values that actually happened or forecast values?

I view bindings as a hardware abstraction layer and as such it shouldn't persist forecast or historic states at all. We have this good separation between binding, persistence, rules, etc. and it's a good idea to keep these blocks separated.

If you mean to provide a separate channel for the time information, then IMHO this is not a good idea.

I mean to provide channels that accept the aggregated values as input, e.g. as comma separated values or json. It's not pretty but at least it's transparent at what's happening and whats going in/out of the binding.

From my understanding this whole "the binding shall use persistence" is only because it's not possible to pass in multiple values in a structured way. Maybe it really is time for a native json or array datatype in openHAB ... .

spacemanspiff2007 avatar Aug 24 '22 06:08 spacemanspiff2007

My argument was not about the binding having to use the rest api. It was "If the user wants to persist future/historic states he shall do so in rules OR the rest api".

Ok, no problem there then - the REST API is fine for the user :)

If bindings write to persistence a whole new set of issues arise:

Maybe i misunderstood, but I don't think anyone is proposing that bindings write to persistence? The binding updates a state - the persistence system effectively receives these events, and persists data. The binding doesn't directly communicate to persistence. There's no change from the status quo.

I view bindings as a hardware abstraction layer and as such it shouldn't persist forecast or historic states at all.

Yes, fundamentally this is correct - it's a hardware abstraction. However what if the hardware provides historic data? Eg what if the hardware provides an alert - at 3PM an alarm went off and the system wants to report this when the power comes back up at 5PM. Currently it would report that the power went off, and then came back on, both at 5PM.

Again, the binding is not "persisting" any data - it's just providing data from the hardware, with a timestamp. The persistence layer does the persisting with no direct interaction from the binding.

I mean to provide channels that accept the aggregated values as input, e.g. as comma separated values or json.

Sadly this is unfortunately the way it currently needs to work and as you say, it's very ugly and very non-standard. ie every binding does something different, and therefore there's no standardisation, and the user is left with the mess. I think the point here about adding times, or even other supplementary data as per my earlier example was to try and provide a standard way to manage this sort of information.

Maybe it really is time for a native json or array datatype in openHAB

I don't think adding a new type is the right approach as IMHO it doesn't solve the problem and creates a compatability mess. I proposed some time back the idea of adding additional attributes - eg times, speeds, etc to existing data types. This could I believe be added and made backward compatible. However this went nowhere unfortunately.

https://github.com/openhab/openhab-core/issues/1298

In any case I think that's off topic as this issue is really only looking at adding the time and not other data attributes.

cdjackson avatar Aug 24 '22 06:08 cdjackson

I think the original issue is exactly what @cdjackson describes: have a way for bindings to publish historical data with the correct timestamp if they are available. If they are only displayed, they should probably be discarded, if they are persisted, this will be taken care of by the persistence service that is responsible for handling states of the respective item.

This is not about entering historical data via UI (this should clearly be handled by the REST API, and I believe the necessary endpoints are already there). For bindings to publish historical data or forecasts the REST API is not the way to go. I believe it boils down to three options:

  1. Add a timestamp to the State. This can be done a) by adding a getTimeStamp to the State interface or b) adding a new interface TimeboundState that can be implemented by the implementations of State (or even HistoricalState and Forecast, that would allow us to handle state updates differently depending on what they are by simple instance-checks).
  2. Add a new datatype TimeboundState and duplicate all implementations of State, so that we have TimeboundDecimalType, TimeboundHSBType and so on.
  3. Add new methods like updateState(State state, long timestamp) or similar.

IMO 2 is the worst solution, 3 is most simple (but clutters API) and 1b is the most flexible but needs some thoughts on the handling. The big plus here is that it is fully backward compatible. 1b would also be a possible way for #1298 (add ExtendedState interface for additional data).

J-N-K avatar Aug 24 '22 15:08 J-N-K

2. Add a new datatype TimeboundState and duplicate all implementations of State, so that we have TimeboundDecimalType, TimeboundHSBType and so on.

This is deeply wrong. Time is additional marker for a state and can be composed with it. With wrapper/decorators you can keep raw states as is. See example I made: https://github.com/openhab/openhab-core/pull/3000#issuecomment-1164560513. By this way timebound states will be ignored unless affected places acknowledge it and do necessary unpack call.

splatch avatar Aug 24 '22 15:08 splatch

  1. Add a timestamp to the State.

My vote goes here. Possibly an extended interface as you suggest might be a good idea as it can act as a marker (as you also mention).

1b would also be a possible way for https://github.com/openhab/openhab-core/issues/1298 (add ExtendedState interface for additional data).

Again, maybe off topic, but I'd be very keen if this was picked up again and happy to help with this. I think it's probably a V4 issue (I tried pushing it for the OH2 -> OH3 transition) more because it's a significant change/addition to the concept, even if I think it's not a breaking change.

cdjackson avatar Aug 24 '22 19:08 cdjackson

If I see it correctly, the PR at hand more or less corresponds to option (3), which is clearly the least intrusive.

I like the separation between State and HistoricState and think the way they are used are very different. The reason I am reluctant with adding a timestamp to the State itself (besides what I already wrote in #3033) is that State updates on the event bus are considered to be a time-series. That is: A continuous data stream which is unmodifiable and with a strict order by time - i.e. more recent events in the stream are always also newer datapoints. This is what everything relies on. No consumer (like the various UIs) have to check whether they ever received a state with a more recent timestamp before showing the latest received state update - it would make their logic very complex if state updates could suddenly come in un-chronological order. Please note here that he very first persistence service in openHAB was the "logging persistence". All it did was writing the state updates to a log file in sequential order.

I understand @cdjackson's use cases and I agree that allowing bindings to also provide historical values can be helpful. But using HistoricStates would nicely avoid above issues and express that this is data for specific use cases that should hence be also handled exceptionally.

kaikreuzer avatar Aug 25 '22 21:08 kaikreuzer

The reason I am reluctant with adding a timestamp to the State itself (besides what I already wrote in #3033) is that State updates on the event bus are considered to be a time-series. That is: A continuous data stream which is unmodifiable and with a strict order by time - i.e. more recent events in the stream are always also newer datapoints.

Where can I find this limitation? There is nothing in the developer documentation that requires any order for events or guarantees an order. Anyway, we could work around that by only sending events over SSE that DO NOT implement TimeboundState. This way we could still use all the infrastructure we currently have for handling state updates.

J-N-K avatar Aug 26 '22 15:08 J-N-K

Where can I find this limitation?

I don't think this is anywhere explicitely written, but it is an implicit assumption by the overall design. As mentioned, the UIs e.g. rely on the fact that the event stream is chronologically sorted, so does every persistence service, etc.

we could work around that by only sending events over SSE that DO NOT implement TimeboundState

I don't think that this would be a clean solution. The source of truth are the events on the event bus, not the ones streamed through SSE. What we could do is to introduce a new event type of TimeboundStateEvents. This way we could make sure that existing logic never comes across them (as it might misinterpret them) and only new code that is specifically interested in them could handle them. I'm not sure, if it is worth the effort, though, or if option (3) wouldn't do the trick in a more lightweight fashion.

kaikreuzer avatar Sep 12 '22 20:09 kaikreuzer

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/entsoe-e-binding-for-nordpool-spot-prices/143833/4

openhab-bot avatar Jan 29 '23 14:01 openhab-bot