snowplow-javascript-tracker icon indicating copy to clipboard operation
snowplow-javascript-tracker copied to clipboard

Add an option to generate the page view ID according to changes in the page URL to account for events tracked before page views in SPAs (close #1307 and #1125)

Open matus-tomlein opened this issue 1 year ago • 5 comments

Issue #1307

Problem

In SPAs, as the user is navigating through pages, it may happen that certain events belonging to a page are tracked before the page view event. For example, auto-tracked events relating to the experiment selection in A/B testing may be tracked before the actual page view. We would want these events to share the same page view ID as the page view event that is tracked for the same page URL even though that is tracked later.

Currently the page view ID is generated at the time of the page view event, so all events tracked before it have a different page view ID (this is not the case for the very first page view event, but is for the second and subsequent events).

Solution

This PR adds an option called preservePageViewIdForUrl to the tracker configuration which has the following settings:

  • false (default) – keep the current behaviour (i.e., new page view ID is generated when new page view event is tracked)
  • true or 'full' – the page view ID will be kept the same for all page views with that exact URL (even for events tracked before the page view event).
  • 'pathname' – the page view ID will be kept the same for all page views with the same pathname (search params or fragment may change).
  • 'pathnameAndSearch' – the page view ID will be kept the same for all page views with the same pathname and search params (fragment may change).

Thus, if this option is configured to something else than false, events that happen on the second page but precede the second page view will have the same page view ID as the second page view. To illustrate, it will work as follows:

  1. User navigates to page http://example.com/a
  2. Custom event A is tracked with page view ID 1
  3. Page view event is tracked with page view ID 1
  4. Custom event B is tracked with page view ID 1
  5. User clicks on a link that takes them to http://example.com/b
  6. Custom event C is tracked with page view ID 2 (this is what this PR enables, previously it would have page view ID 1)
  7. Page view event is tracked with page view ID 2
  8. Custom event D is tracked with page view ID 2

This PR also fixes issue #1125 which reported the problem that if two trackers are initialized on the same page, the second page view event will have different page view IDs for each of them (the first will have the same). If this new option is configured, the second page view event will have the same page view ID on both trackers.

cc @davidher-mann and @j7i – we discussed this problem on our call, would be great to get your feedback if this solution fixes the issue for you!

matus-tomlein avatar May 02 '24 13:05 matus-tomlein

BundleMon

Files added (6)
Status Path Size Limits
:white_check_mark: libraries/browser-tracker-core/dist/index.mod
ule.js
+27.29KB 28KB / +10%
:white_check_mark: trackers/javascript-tracker/dist/sp.js
+25.37KB 25.5KB / +10%
:white_check_mark: trackers/javascript-tracker/dist/sp.lite.js
+15.68KB 16KB / +10%
:white_check_mark: trackers/browser-tracker/dist/index.umd.min.j
s
+15.53KB 16KB / +10%
:white_check_mark: libraries/tracker-core/dist/index.module.js
+13.36KB 15KB / +10%
:white_check_mark: trackers/browser-tracker/dist/index.module.js
+3.51KB 5KB / +10%

Total files change +100.74KB 0%

Final result: :white_check_mark:

View report in BundleMon website ➡️


Current branch size history

bundlemon[bot] avatar May 02 '24 14:05 bundlemon[bot]

That was quick Matus - thanks!

davidher-mann avatar May 02 '24 14:05 davidher-mann

Thank you @matus-tomlein - I will be able to look into it on Monday.

j7i avatar May 02 '24 15:05 j7i

This solution should solve our current problem, and I also appreciate the options provided to further specify an appropriate strategy such as pathname. Thank you!

However, we are currently aiming for a deferred mode where we delay the execution of tracking events (should go live this week). In this case, we would actually fire tracking events on a new page, even though they may originate from a previous page. We do this to shorten presentation delays on interactions.

With the current solution this would result in this scenario (when running in deferred mode; page view events are excluded from deferral):

  1. User navigates to page http://example.com/a (generate page view ID 1)
  2. Custom event A is scheduled
  3. Custom event A is tracked with page view ID 1
  4. Page view event is tracked with page view ID 1
  5. Custom event B is scheduled
  6. --- User clicks on a link that takes them to http://example.com/b (generate page view ID 2)
  7. Custom event B is tracked with page view ID 2 (should be page view ID 1 👀)
  8. Custom event C is scheduled
  9. Page view event is tracked with page view ID 2
  10. Custom event C is tracked with page view ID 2

We will consider to checkout this strategy in combination with performance improvements regarding the cookie strategy / expensive cookie reads/writes. As we may can get rid off the deferral mode as soon as tracking calls are less expensive.

j7i avatar May 06 '24 16:05 j7i

As an addition to my previous comment: If there is some sort of override available in combination with a global context getter, this could work with a deferred approach. But I think this would be another topic to further discuss / explore.

/** This code is for demonstration purposes only */

// should also return the web page entity among the other global ones
import { getGlobalContext } from "@snowplow/browser-tracker";

const globalContext = getGlobalContext(); // could receive a tracker id as an argument
const timestamp = Date.now();
const event = { ... };
const context = [ ... ]; 

requestIdleCallback(() => 
  trackSelfDescribingEvent({
    event,
    context,
    globalContext,
    timestamp,
  });
);

j7i avatar May 06 '24 16:05 j7i