snowplow-javascript-tracker
Option to reset pageViewId.
Describe the bug
If we create multiple trackers using `newTracker` and call `window.snowplow("trackPageView")` for the first time,
Snowplow sends the same pageViewId with the page view of every tracker,
whereas on subsequent calls to `window.snowplow("trackPageView")` each tracker sends a different pageViewId.
To Reproduce
Consider the following:
```html
<script>
;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[]; p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments) };p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1; n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","https://cdnjs.cloudflare.com/ajax/libs/snowplow/2.18.2/sp.min.js","snowplow"));
snowplow('newTracker', 'tracker1', 'collector.tracker1.com', initOptions);
snowplow('newTracker', 'tracker2', 'collector.tracker1.com', initOptions);
</script>
```
After this first step, if we run `snowplow('trackPageView');`, **tracker1** and **tracker2** will have the same pageViewId.
But after the first trackPageView, if we run `snowplow('trackPageView');` again, **tracker1** and **tracker2** will have different pageViewIds.
The example above is a use case for a single-page application, where we call trackPageView on page load as well as when the user navigates to a different route without reloading the page.
Hello @VIKIVIKA, thanks for reporting this. We will take a look and update you soon :)
In the meantime a codesandbox with the reproduction would be really helpful. Cheers!
@igneel64 thank you for the response, here is the sandbox with reproducible code,
https://codesandbox.io/s/withered-cache-xjio6q?file=/src/App.js
Also, here are screenshots of the pageViewId behaviour, both on page load and on a manual method call:

You can click the Track PageView button to trigger page views manually.

@igneel64 even the events we generate apart from page views use the page view ID of the most recent tracker, even when I attach the events to a specific tracker.
tracker 1 page view id:

tracker 2 page view id:

SD event attached to tracker 1, but shows tracker2 page view id:

SD event attached to tracker 2, shows tracker2 page view id:

This way all the events are tracked as part of only the last page view ID, even if we associate them with a specific tracker.
Am I missing something here? Or is there any way to link the events to a particular tracker and its respective page view ID?
@VIKIVIKA Thank you for the reproduction link!
From a quick look at the sandbox, it seems you are logging the pageViewId at a point where the output of `getPageViewId` is expected to be different.
From our point of view, this feature is set up so that for each page view you have the same ID in all trackers, in a sort of shared state, which makes analysis sane. So when you call `trackPageView` on a tracker, the pageViewId is updated and is shared in subsequent calls from other trackers as well, except for another page view event.
As far as I can see, in the sandbox you log the pageViewId at a different point for each tracker, in particular after a new page view event has been triggered. By changing the code to something like:
window.trackSnowplowPageView = () => {
  window.snowplow("trackPageView:tracker1");
  window.snowplow(function () {
    console.log("pageViewId1", this.tracker1?.getPageViewId());
    console.log("pageViewId2", this.tracker2?.getPageViewId());
  });
  window.snowplow("trackPageView:tracker2");
  window.snowplow(function () {
    console.log("pageViewId1", this.tracker1?.getPageViewId());
    console.log("pageViewId2", this.tracker2?.getPageViewId());
  });
};
you can see that the pageViewId remains consistent.
Now, based on your initial comment:

> tracker1 and tracker2 will have different pageViewID's respectively.

If any page view event is sent, the pageViewId will change, but it is shared.

> This way all the event's are being tracked as part of only the last page view id, even if we associate with specific tracker.

Yes, events will be tracked using the last page view ID.

> or is there any way to link the event's to a particular tracker and with respective page view id

In this case, you would use the tracker name attribute to differentiate between the trackers, but I suppose that your use case requires something different.
Could you give us a hint about what you are trying to achieve in your analysis?

Note:
The initial pageViewIds are the same because, when a tracker has not sent any page view yet, it uses the initial one, so that there is a common start.
@igneel64 We are trying to load two different trackers on a single-page application, implementing a set of events and page pings, and we expect the following behaviour.
Example:
`window.snowplow("trackPageView:tracker1")` should give pageViewId1 from tracker1,
`window.snowplow("trackPageView:tracker2")` should give pageViewId2 from tracker2.
Now, say I create a self-describing event using tracker1, `window.snowplow("trackSelfDescribingEvent:tracker1")`. My expectation is that this event uses the pageViewId generated for tracker1, but because the last page view was generated for tracker2, it uses tracker2's pageViewId.
As we collect page pings and other events based on the pageViewId we receive, we might miss tracker1's page pings and events in this case.
Also, we use different collectors for these trackers, so tracker1 and tracker2 events do not end up in the same place. In this scenario we cannot map page pings by pageViewId in the collector used for tracker1, because they reference tracker2's pageViewIds.
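The scenario above can be reproduced with a small stand-alone model. This is only a sketch under assumptions inferred from the behaviour reported in this thread: one page view ID is shared by all trackers, and it is regenerated whenever a tracker that has already sent a page view sends another one. `createSharedTrackers`, `trackEvent` and the counter-based IDs are hypothetical names for illustration and are not part of the Snowplow API.

```javascript
// Assumed model of the behaviour discussed in this thread, NOT the tracker's real source.
function createSharedTrackers(names, generateId) {
  // A single page view ID is shared by every tracker on the page.
  const shared = { pageViewId: generateId() };
  const trackers = {};
  for (const name of names) {
    let pageViewSent = false; // assumption: remembered separately for each tracker
    trackers[name] = {
      trackPageView() {
        // A tracker's first page view reuses the current ID; later ones replace it.
        if (pageViewSent) shared.pageViewId = generateId();
        pageViewSent = true;
        return { tracker: name, type: "page_view", pageViewId: shared.pageViewId };
      },
      trackEvent(type) {
        // Every other event carries whatever the shared ID is right now.
        return { tracker: name, type, pageViewId: shared.pageViewId };
      },
    };
  }
  return trackers;
}

// Counter-based IDs keep the run deterministic; the IDs seen in this thread are UUIDs.
let counter = 0;
const { tracker1, tracker2 } = createSharedTrackers(
  ["tracker1", "tracker2"],
  () => `id-${++counter}`
);

// Initial page load: both page views share the initial ID.
tracker1.trackPageView();
tracker2.trackPageView();

// Route change: each tracker regenerates the shared ID in turn.
const pageView1 = tracker1.trackPageView();
const pageView2 = tracker2.trackPageView();

// A later event sent through tracker1 carries tracker2's page view ID.
const selfDescribing = tracker1.trackEvent("self_describing");

console.log(pageView1.pageViewId, pageView2.pageViewId, selfDescribing.pageViewId);
// id-2 id-3 id-3
```

In this model the event sent through tracker1 can no longer be joined to tracker1's own page view by page view ID alone, which is the mismatch described above.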
Thank you for the detailed description. We will get back to you soon :)
Just to mention that this does not seem to be a technical issue of the pageViewId per se.
Hey @igneel64, bumping this issue as we're seeing it in our production app while trying to migrate collectors. We're pretty convinced this is in fact a bug in the tracker code related to how `pageViewId`s are generated in multi-collector setups.
We call `trackPageView` on each URL change in our SPA, including on initial page load. Here's an example illustrating what we're seeing:
Code:
const trackerOne = newTracker('sp1', ...);
const trackerTwo = newTracker('sp2', ...);
const track = () => {
  console.log("pageViewIdOne: ", trackerOne.getPageViewId());
  console.log("pageViewIdTwo: ", trackerTwo.getPageViewId());
  // Since we don't specify the tracker here, it should use both
  trackPageView();
  console.log("pageViewIdOne: ", trackerOne.getPageViewId());
  console.log("pageViewIdTwo: ", trackerTwo.getPageViewId());
}
Console output on initial page load:
pageViewId1: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId2: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId1: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId2: 52657899-e540-40ff-993b-f494fe7f7297
Console output on page change:
pageViewId1: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId2: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId1: 9fd9459b-8d58-4f34-86bd-9efed31b5590
pageViewId2: 9fd9459b-8d58-4f34-86bd-9efed31b5590
The above output is correct, and what we would expect to see. On initial page load, we get a single `pageViewId` that is shared across all of our trackers. The initial call to `trackPageView` uses this initial page view ID. On page change, we regenerate the `pageViewId`, which is updated for both trackers.
However, this misses what's actually going on behind the scenes. If you look at the actual `pageViewId`s that are sent with the events, they do not all align with the above.
Sent page views on initial page load:
trackerOne event: 52657899-e540-40ff-993b-f494fe7f7297
trackerTwo event: 52657899-e540-40ff-993b-f494fe7f7297
Sent page views on page change:
trackerOne event: 6b08f421-d347-4cb2-8a04-5746c0ebb3bd
trackerTwo event: 9fd9459b-8d58-4f34-86bd-9efed31b5590
Notice that on page change, trackerOne is sending a `pageViewId` that is not represented in the console.log output.
Here's what we think is going on here:
Behind the scenes, for any one call on our end to `trackPageView`, the JavaScript tracker iterates over each collector and regenerates the `pageViewId` once for each. First, we track using the first collector and generate a pageViewId, `6b08f421-d347-4cb2-8a04-5746c0ebb3bd`; then we track using the second collector and generate a new pageViewId, `9fd9459b-8d58-4f34-86bd-9efed31b5590`. By the time we `console.log` the output at the end, the global `pageViewId` (which is rightfully shared across all trackers) has been updated to the newer ID, so the print statements obscure what's actually happening.
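The explanation above can be checked against a toy model. The sketch below assumes, based only on the IDs reported in this thread, that the page view ID is shared while each tracker remembers on its own whether it has already sent a page view. `createTrackers` and `trackPageViewAll` are hypothetical stand-ins, not functions of the tracker.

```javascript
// Assumed model inferred from the IDs reported in this thread, NOT the tracker's real source.
function createTrackers(names, generateId) {
  const shared = { pageViewId: generateId() }; // shared by all trackers
  const trackers = names.map((name) => {
    let pageViewSent = false; // assumption: kept per tracker
    return {
      name,
      getPageViewId: () => shared.pageViewId,
      trackPageView() {
        // Each tracker regenerates the shared ID once it has sent a page view before.
        if (pageViewSent) shared.pageViewId = generateId();
        pageViewSent = true;
        return { tracker: name, pageViewId: shared.pageViewId };
      },
    };
  });
  // Stand-in for an untargeted trackPageView(): every tracker handles the call in turn.
  const trackPageViewAll = () => trackers.map((t) => t.trackPageView());
  return { trackers, trackPageViewAll };
}

// Counter-based IDs make the intermediate value easy to spot.
let counter = 0;
const { trackers, trackPageViewAll } = createTrackers(
  ["sp1", "sp2"],
  () => `id-${++counter}`
);

const sentOnLoad = trackPageViewAll().map((e) => e.pageViewId);
const sentOnChange = trackPageViewAll().map((e) => e.pageViewId);
const visibleAfterChange = trackers.map((t) => t.getPageViewId());

console.log(sentOnLoad);         // [ 'id-1', 'id-1' ]
console.log(sentOnChange);       // [ 'id-2', 'id-3' ]
console.log(visibleAfterChange); // [ 'id-3', 'id-3' ]
```

The intermediate `id-2` is sent by the first tracker but is never visible through `getPageViewId()` once the call has finished, which matches the pattern in the console output and sent events listed earlier.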
We would expect the browser tracker to ensure that a single `pageViewId` is generated per external call to `trackPageView`. If we call `trackPageView` with multiple collectors, we would expect them to generate only a single new `pageViewId`, rather than regenerating it on a per-collector basis. The page view event is being sent as a single unit, so it would make sense that the event has a single ID.
I'm not very familiar with the internals of this repo, but I think what needs to be added is some kind of additional check that ensures we only call `resetPageView` for the first tracker in the list of all trackers. A super naive implementation to illustrate what I mean could look something like:
// `trackers`, `config`, `dispatchToTrackers`, `pageViewSent` and `resetPageView`
// are assumed to exist in the surrounding tracker code.
export function trackPageView(event) {
  const firstTracker = trackers[0];
  const remainingTrackers = trackers.slice(1);
  // Set some global state val indicating that this is the first in a set of trackers
  config.isFirstTracker = true;
  firstTracker.trackPageView(event);
  // Reset the global state val
  config.isFirstTracker = false;
  dispatchToTrackers(remainingTrackers, (t) => {
    t.trackPageView(event);
  });
}
export function logPageView(event) {
  // Only the first tracker in the set regenerates the shared pageViewId
  if (pageViewSent && config.isFirstTracker) {
    resetPageView();
  }
}
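As a runnable illustration of the same idea, here is a minimal sketch in which the dispatcher regenerates the shared ID once per external call and every tracker only reads it. It is a toy model, not a patch against the tracker: `createTrackersWithSingleReset`, `sendPageView` and `trackPageViewAll` are hypothetical names, and a shared `pageViewSent` flag is an assumption of this sketch.

```javascript
// Hypothetical sketch of "one new pageViewId per external trackPageView call".
function createTrackersWithSingleReset(names, generateId) {
  const shared = { pageViewId: generateId(), pageViewSent: false };
  const trackers = names.map((name) => ({
    name,
    getPageViewId: () => shared.pageViewId,
    // Trackers no longer reset the ID themselves; they only read the shared value.
    sendPageView: () => ({ tracker: name, pageViewId: shared.pageViewId }),
  }));

  function trackPageViewAll() {
    // Regenerate once per external call, except for the very first page view.
    if (shared.pageViewSent) shared.pageViewId = generateId();
    shared.pageViewSent = true;
    return trackers.map((t) => t.sendPageView());
  }

  return { trackers, trackPageViewAll };
}

let counter = 0;
const { trackers, trackPageViewAll } = createTrackersWithSingleReset(
  ["sp1", "sp2"],
  () => `id-${++counter}`
);

const sentOnLoad = trackPageViewAll().map((e) => e.pageViewId);
const sentOnChange = trackPageViewAll().map((e) => e.pageViewId);

console.log(sentOnLoad);                  // [ 'id-1', 'id-1' ]
console.log(sentOnChange);                // [ 'id-2', 'id-2' ]
console.log(trackers[0].getPageViewId()); // id-2
```

The sketch only covers untargeted calls. How a call aimed at a single tracker (for example `trackPageView:tracker1`) should affect the shared ID is a separate design decision that it does not try to answer.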
Thanks, and lemme know your thoughts!
Edit:
One more note on why I think this is a bug as opposed to expected behavior:
The page ping events for the trackers fire using the latest global `pageViewId`. This means that the `pageView` event from all but the last tracker will be orphaned, with no associated page ping events after a page change.
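One way to spot such orphaned page views in collected data is sketched below. The event shape (`tracker`, `type`, `pageViewId`) and the sample values are made up for illustration, following the page-change pattern described in this thread; they are not Snowplow's event schema, and `findOrphanedPageViews` is a hypothetical helper.

```javascript
// Returns page view events that have no page ping from the same tracker
// carrying the same page view ID.
function findOrphanedPageViews(events) {
  const pinged = new Set(
    events
      .filter((e) => e.type === "page_ping")
      .map((e) => `${e.tracker}|${e.pageViewId}`)
  );
  return events.filter(
    (e) => e.type === "page_view" && !pinged.has(`${e.tracker}|${e.pageViewId}`)
  );
}

// Made-up sample: after a page change, both trackers ping with the latest shared ID.
const events = [
  { tracker: "sp1", type: "page_view", pageViewId: "id-2" },
  { tracker: "sp2", type: "page_view", pageViewId: "id-3" },
  { tracker: "sp1", type: "page_ping", pageViewId: "id-3" },
  { tracker: "sp2", type: "page_ping", pageViewId: "id-3" },
];

const orphaned = findOrphanedPageViews(events);
console.log(orphaned); // only the sp1 page view carrying id-2
```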