divolte-collector
Session and party identifiers don't time out within a page-view even if they should
Problem Statement
When a session is created and there is no activity (i.e. no events/tags generated by the client-side .js) for longer than the session timeout (default 30 minutes), the session ID is not regenerated for the next event, although it should be.
I can't see in the code what is actually done with session_timeout, apart from it being read from the configuration if defined there, or set to the default value (30 minutes).
Suggested fix
I would assume that on each call to the signal method, the session timeout should be used to check whether the time elapsed since the last event was sent exceeds the timeout. If it does, the signal() method should generate a new sessionId and update the cookie accordingly before sending the event/tag.
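The check suggested above could look roughly like the following. This is only a minimal sketch, not code from divolte.js: createSessionTracker, sessionIdFor and the generateId callback are hypothetical names, and the 30-minute value is the default mentioned in this issue.

```javascript
// Hypothetical sketch; none of these names come from divolte.js.
const SESSION_TIMEOUT_MS = 30 * 60 * 1000; // default session timeout (30 minutes)

function createSessionTracker(generateId, timeoutMs = SESSION_TIMEOUT_MS) {
  let sessionId = null;
  let lastEventAt = null;
  return {
    // Returns the session id to attach to an event that occurs at `now` (in ms).
    sessionIdFor(now) {
      const expired = lastEventAt === null || now - lastEventAt > timeoutMs;
      if (expired) {
        // Idle for longer than the timeout (or first event): start a new session.
        sessionId = generateId();
      }
      lastEventAt = now;
      return sessionId;
    },
  };
}

// Usage: two events 10 minutes apart share a session; 35 idle minutes later a new one starts.
let counter = 0;
const tracker = createSessionTracker(() => `session-${++counter}`);
const first = tracker.sessionIdFor(0);
const second = tracker.sessionIdFor(10 * 60 * 1000);
const third = tracker.sessionIdFor(45 * 60 * 1000);
console.log(first === second, second === third); // true false
```

In a real page, signal() would pass Date.now() and would also have to rewrite the session cookie whenever a new id is generated.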
If there is a good reason for this not to be a bug, please explain why. In that case, what are the actual semantics of session_timeout? Thanks.
Looks like a bug.
The session timeout is used to set the expiration on the session cookie. If the page stays open, but idle for the duration of the session timeout, the in-memory session ID that the script holds is not invalidated. The same holds for the isFirstInSession boolean flag.
Perhaps the safest way of checking for expiry is to re-parse the document cookies on each signal invocation. This would also be robust against a user deleting their cookies mid-session. This extends to the party ID and other information stored in cookies.
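As a rough illustration of re-parsing the cookies on each signal invocation, here is a minimal sketch. The helper names (parseCookies, currentIdentifier) are hypothetical, and the cookie name in the usage note is an assumption that should be checked against the collector configuration. The functions take the cookie string as an argument so they can be tried outside a browser.

```javascript
// Hypothetical sketch; not the actual divolte.js implementation.

// Parse a `document.cookie`-style string ("a=1; b=2") into a Map of name -> raw value.
function parseCookies(cookieString) {
  const cookies = new Map();
  for (const pair of cookieString.split(";")) {
    const index = pair.indexOf("=");
    if (index < 0) continue; // skip fragments without a name=value shape
    const name = pair.slice(0, index).trim();
    const value = pair.slice(index + 1).trim();
    if (name) cookies.set(name, value);
  }
  return cookies;
}

// Look the identifier up in the cookies every time instead of trusting an in-memory copy.
// An expired or deleted cookie is simply absent, so a fresh identifier is generated.
function currentIdentifier(cookieString, cookieName, generateId) {
  const existing = parseCookies(cookieString).get(cookieName);
  return existing !== undefined
    ? { id: existing, isNew: false }
    : { id: generateId(), isNew: true };
}
```

In the browser this would be called as currentIdentifier(document.cookie, "_dvs", generateId) at the start of every signal, assuming "_dvs" is the configured session cookie name; the caller would then write the cookie back with a renewed expiry.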
Thanks Friso for your quick feedback. Your proposal sounds good to me, i.e. checking the session cookie on each signal invocation and regenerating the session ID and the session cookie if needed.
Now, regarding the partyId: the documentation says it identifies a client, i.e. something common to a set of distinct client sessions. But I've noticed that when you close a tab (or the browser) and reopen it, the partyId is regenerated as well, while I thought it would not change. It therefore seems to identify a tab rather than a client. Can you elaborate on what it is exactly? Maybe there is something wrong here. Please tell me if I need to open a new bug for that as well. Thanks, Jerome
@jerome73 A party ID is a long-lived cookie. It is the same for each tab in a normal browser instance. Getting a new cookie for each tab implies some kind of private mode, blocking of known trackers, or similar.
OK. When do you think the session timeout issue will be fixed? Thanks Friso
This is indeed a bug. The issue is that we only time out session or party identifiers at the start of a page view. While a page view is underway we never expire either, even if the page view exceeds the party or session timeouts.
Fixing this involves some careful thinking:
- Is it okay that a page-view spans sessions? (Or even switches to a different party id?) My instinct is to say yes, and that basically confirms this issue as a bug.
- How important is it that users reset everything while a page view is underway? My initial answer is "quite".
  - If the user resets the party/session by deleting the cookies, should we also start a new page-view? Otherwise the reset was ineffectual because the new and old can be linked via the page view.
- Do we wish to preserve the existing divolte.partyId and divolte.sessionId properties? (And their associated isNew[…] properties.)
There are two obvious implementations for this:
- Roundtrip from cookies on every call to divolte.signal() and continue to rely on cookies for timing out sessions and parties. As @friso mentioned we get divolte-reset-via-cookie-reset for free. But we break the current divolte.partyId and divolte.sessionId properties since they might be out of sync. (We could use JavaScript getters to make these properties dynamic, but not all browsers support this.)
- Use a timer within the page to handle expiry ourselves. This would basically work the same, except we'd lose divolte-reset-via-cookie-reset. At best (with checking) we'd figure this out after the fact.
My proposal is:
- Deprecate the existing partyId, sessionId, isNewPartyId and isFirstInSession properties.
- Implement new partyIdentifier(), sessionIdentifier(), isNewPartyIdentifier() and isNewSessionIdentifier() methods that check the cookies on every call and thus allow session/party identifiers to time out in the middle of a page view or if the user deletes all cookies.
- If a session/party identifier reset is detected, also reset the page view identifier. However an implicit pageView event will not be generated.
- Update the divolte.signal() method to use the new methods.
Also worth considering but out of scope for this issue would be a formal API to allow the page itself to reset or extend the lifetime of the identifiers.
Hi Andrew,
Thanks for your analysis and proposal. I assume that when you say "deprecate the existing partyId, sessionId", this means regenerating new partyId and sessionId. If so your proposal sounds good to me. Cheers, Jerome
Hi Andrew and Friso, what is the status of this issue? Thanks, Jerome
Hi. A colleague and I were analysing data from a site and we encountered this bug. How close are you to fixing it?
At the moment I don't think anyone has volunteered to fix it.
Hi guys. I want to connect Divolte to Kafka and then to KSQL and do some querying, but it always says :+1: ERROR Unknown error when running consumer: (kafka.tools.ConsoleConsumer$:46) org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
Please help :) It's getting on my nerves.