snowplow-javascript-tracker

Option to reset pageViewId.

Open VIKIVIKA opened this issue 2 years ago • 7 comments

Describe the bug

If we create multiple trackers using newTracker and call window.snowplow("trackPageView") for the first time, Snowplow assigns the same pageViewId to the page views of all trackers, whereas on subsequent calls to window.snowplow("trackPageView") each tracker gets a different pageViewId.

To Reproduce

Consider the following:

```html
<script>
	;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[]; p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments) };p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1; n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","https://cdnjs.cloudflare.com/ajax/libs/snowplow/2.18.2/sp.min.js","snowplow"));
		
		snowplow('newTracker', 'tracker1', 'collector.tracker1.com', initOptions);
		snowplow('newTracker', 'tracker2', 'collector.tracker1.com', initOptions);
</script>
```	

After this first step, if we run `snowplow('trackPageView');`, **tracker1** and **tracker2** will have the same pageViewId.

But after the first trackPageView, if we run **snowplow('trackPageView');** again, **tracker1** and **tracker2** will each have a different pageViewId.

The above example is a use case for a single-page application, where we call trackPageView on page load as well as when a user navigates to a different route without reloading the page.

VIKIVIKA avatar Oct 25 '22 18:10 VIKIVIKA

Hello @VIKIVIKA, thanks for reporting this. We will take a look and update you soon :)

In the meantime a codesandbox with the reproduction would be really helpful. Cheers!

igneel64 avatar Oct 26 '22 07:10 igneel64

@igneel64 thank you for the response, here is the sandbox with reproducible code,

https://codesandbox.io/s/withered-cache-xjio6q?file=/src/App.js

Also, here are screenshots of the pageViewId behaviour, both on page load and when the method is called manually:

image

You can click the Track PageView button to trigger page views manually.

image

VIKIVIKA avatar Oct 26 '22 18:10 VIKIVIKA

@igneel64 even the events we generate other than page views use the page view id of the most recent tracker, even when I attach the events to a specific tracker.

tracker 1 page view id:

image

tracker 2 page view id:

image

SD event attached to tracker 1, but shows tracker2's page view id:

image

SD event attached to tracker 2, shows tracker2's page view id:

image

This way all the events are tracked as part of the last page view id only, even if we associate them with a specific tracker.

Am I missing something here? Or is there any way to link the events to a particular tracker and its respective page view id?

VIKIVIKA avatar Oct 27 '22 04:10 VIKIVIKA

@VIKIVIKA Thank you for the reproduction link!

From a quick look at the sandbox, it seems like you are logging the pageViewId at a point where the output of getPageViewId is expected to be different.

From our point of view, this feature is designed so that each page view has the same id in all trackers, as a kind of shared state, which keeps analysis sane. So when you call trackPageView on a tracker, the pageViewId is updated and shared with subsequent calls from other trackers as well, unless one of those calls is itself another page view event.

As far as I can see, in the sandbox you log the pageViewId at a different point for each tracker, and in particular after a new page view event has been triggered. By changing the code to something like:

      window.trackSnowplowPageView = () => {
        window.snowplow("trackPageView:tracker1");
        window.snowplow(function () {
          console.log("pageViewId1", this.tracker1?.getPageViewId());
          console.log("pageViewId2", this.tracker2?.getPageViewId());
        });
        window.snowplow("trackPageView:tracker2");
        window.snowplow(function () {
          console.log("pageViewId1", this.tracker1?.getPageViewId());
          console.log("pageViewId2", this.tracker2?.getPageViewId());
        });
      };

you can see that the pageViewId remains consistent.

Now based on your initial comment:

> tracker1 and tracker2 will have different pageViewIds respectively.

Whenever a page view event is sent, the pageViewId changes, but the new id is shared by all trackers.

> This way all the events are being tracked as part of only the last page view id, even if we associate with specific tracker.

Yes, events will be tracked using the last page view id.

> or is there any way to link the events to a particular tracker and with respective page view id

In this case, you would be using the tracker name attribute to differentiate between the trackers, but I suppose that your use case requires something different.

Could you give us a hint about what you are trying to achieve in your analysis?

Note:

The initial pageViewIds are the same because when the tracker has not sent any pageview yet, we are using the initial one, so that there is a common start.

igneel64 avatar Oct 27 '22 06:10 igneel64

@igneel64 We are trying to load two different trackers in a single-page application, each with its own set of events and page pings, and we expect the following behaviour:

Example:

window.snowplow("trackPageView:tracker1"), should give pageViewId1 from tracker1,

window.snowplow("trackPageView:tracker2"), should give pageViewId2 from tracker2,

Now, let's say I create a self-describing event using tracker1, window.snowplow("trackSelfDescribingEvent:tracker1"). My expectation would be that this event uses the pageViewId generated by tracker1, but because the last page view was generated for tracker2, it uses tracker2's pageViewId.

As we collect page pings and other events based on the pageViewId we receive, we might miss tracker1's page pings and events in this case.

Also, we use different collectors for these trackers, so tracker1 and tracker2 data do not end up in the same place. In this scenario we cannot map page pings by pageViewId in the collector used for tracker1, because they reference the pageViewIds of tracker2.

VIKIVIKA avatar Oct 27 '22 07:10 VIKIVIKA

Thank you for the detailed description. We will get back to you soon :) Just to mention that this does not seem to be a technical issue of the pageViewId per se.

igneel64 avatar Oct 27 '22 10:10 igneel64

Hey @igneel64, bumping this issue because we're seeing it in our production app while trying to migrate collectors. We're pretty convinced this is in fact a bug in the tracker code related to how pageViewIds are generated in multi-collector setups.

We call trackPageView on each url change in our SPA, including on initial page load. Here's an example illustrating what we're seeing:

Code:

  const trackerOne = newTracker('sp1', ...);
  const trackerTwo = newTracker('sp2', ...);

  const track = () => {
    // Labels match the console output shown below
    console.log("pageViewId1:", trackerOne.getPageViewId());
    console.log("pageViewId2:", trackerTwo.getPageViewId());

    // Since we don't specify the tracker here, it should use both
    trackPageView();

    console.log("pageViewId1:", trackerOne.getPageViewId());
    console.log("pageViewId2:", trackerTwo.getPageViewId());
  };

Console output on initial page load:

pageViewId1: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId2: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId1: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId2: 52657899-e540-40ff-993b-f494fe7f7297

Console output on page change:

pageViewId1: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId2: 52657899-e540-40ff-993b-f494fe7f7297
pageViewId1: 9fd9459b-8d58-4f34-86bd-9efed31b5590
pageViewId2: 9fd9459b-8d58-4f34-86bd-9efed31b5590

The above output is correct, and what we would expect to see. On initial page load, we get a single pageViewId that is shared across all of our trackers. The initial call to trackPageView uses this initial page view ID. On page change, we regenerate the pageViewId, which is updated for both trackers.

However, this misses what's actually going on behind the scenes. If you look at the actual pageViewIds that are sent with the events, they do not all align with the above.

Sent page views on initial page load:

trackerOne event: 52657899-e540-40ff-993b-f494fe7f7297
trackerTwo event: 52657899-e540-40ff-993b-f494fe7f7297

Sent page views on page change:

trackerOne event: 6b08f421-d347-4cb2-8a04-5746c0ebb3bd
trackerTwo event: 9fd9459b-8d58-4f34-86bd-9efed31b5590

Notice that on page change, trackerOne is sending a pageViewId that is not represented in the console.log output.

Here's what we think is going on here:

Behind the scenes, for any one call on our end to trackPageView, the JavaScript tracker iterates over each collector and regenerates the pageViewId once for each. First, we track using the first collector and generate a pageViewId, 6b08f421-d347-4cb2-8a04-5746c0ebb3bd; then we track using the second collector and generate a new pageViewId, 9fd9459b-8d58-4f34-86bd-9efed31b5590. By the time we console.log the output at the end, the global pageViewId (which is rightfully shared across all trackers) has been updated to the newer ID, so the print statements obscure what's actually happening.
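
To make that hypothesis concrete, here is a small stand-alone model of it. It is only a sketch and not the tracker's source: `createTracker`, `shared`, `sent` and the counter-based `nextId` are made-up names, and the assumption that each tracker keeps its own "page view already sent" flag while the id itself is shared is inferred from the output above.

```javascript
// Illustrative model only; not the tracker's actual source.
let counter = 0;
const nextId = () => `id-${++counter}`; // deterministic stand-in for a UUID

const shared = { pageViewId: nextId() }; // id shared by all trackers (assumption)
const sent = [];                         // what each tracker actually sends

function createTracker(name) {
  let pageViewSent = false; // per-tracker flag (assumption)
  return {
    getPageViewId: () => shared.pageViewId,
    trackPageView() {
      // Each tracker regenerates the shared id once it has sent a page view before.
      if (pageViewSent) shared.pageViewId = nextId();
      pageViewSent = true;
      sent.push({ tracker: name, pageViewId: shared.pageViewId });
    },
  };
}

const trackers = [createTracker("sp1"), createTracker("sp2")];

// One external call fans out to every tracker, like an un-targeted trackPageView.
const trackPageViewAll = () => trackers.forEach((t) => t.trackPageView());

trackPageViewAll(); // initial page load: both trackers send the initial id
trackPageViewAll(); // page change: each tracker regenerates the id in turn

console.log(sent);
console.log("visible afterwards:", trackers.map((t) => t.getPageViewId()));
```

In this model the id sent by sp1 on the second call is never visible through getPageViewId after the call returns, which matches the mismatch between our console output and the sent events.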

We would expect the browser tracker to ensure that a single pageViewId is generated per external call to trackPageView. If we call trackPageView with multiple collectors, we would expect only one new pageViewId to be generated, rather than one per collector. The page view event is being sent as a single unit, so it would make sense for the event to have a single ID.

I'm not very familiar with the internals of this repo, but I think what needs to be added is some kind of additional check that ensures we only call resetPageView for the first tracker in the list of all trackers. A super naive implementation to illustrate what I mean could look something like:

export function trackPageView(event) {
  const firstTracker = trackers[0];
  const remainingTrackers = trackers.slice(1);

  // Set a global flag indicating that this is the first in a set of trackers
  config.isFirstTracker = true;

  firstTracker.trackPageView(event);

  // Reset the global flag
  config.isFirstTracker = false;

  dispatchToTrackers(remainingTrackers, (t) => {
    t.trackPageView(event);
  });
}

export function logPageView(event) {
  // Only regenerate the pageViewId for the first tracker in the set
  if (pageViewSent && config.isFirstTracker) {
    resetPageView();
  }
}
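
For comparison, here is a runnable toy version of the same idea, where the id is regenerated at most once per external call before dispatching. Again this is a sketch under assumptions rather than a patch: `makeTracker`, `sendPageView`, `shared` and `nextId` are hypothetical names and do not exist in the tracker.

```javascript
// Hypothetical sketch of the proposed behaviour; all names are illustrative.
let counter = 0;
const nextId = () => `id-${++counter}`; // deterministic stand-in for a UUID

const shared = { pageViewId: nextId(), pageViewSent: false };
const sent = [];

// In this sketch a tracker only sends; it no longer decides when to regenerate.
const makeTracker = (name) => ({
  sendPageView(pageViewId) {
    sent.push({ tracker: name, pageViewId });
  },
});

function trackPageView(trackers) {
  // Regenerate at most once per external call, before fanning out.
  if (shared.pageViewSent) shared.pageViewId = nextId();
  shared.pageViewSent = true;
  trackers.forEach((t) => t.sendPageView(shared.pageViewId));
}

const all = [makeTracker("sp1"), makeTracker("sp2")];
trackPageView(all); // initial page load
trackPageView(all); // page change: one new id, used by both trackers

console.log(sent);
```

One trade-off of a single global flag like this is that a call targeted at one tracker would also regenerate the id for the others, so a real fix would need to decide how targeted calls should behave.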

Thanks, and lemme know your thoughts!


Edit:

One more note on why I think this is a bug as opposed to expected behavior:

The page ping events for the trackers fire using the latest global pageViewId. This means that the pageView event from all but the last tracker will be orphaned, and have no associated page ping events after page change.
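
A quick way to check for this in collected data is to look for page views whose id never appears on a page ping. The snippet below is a sketch: the event shape and `findOrphanedPageViews` are made up for illustration, the page view ids are the ones from the output above, and the page ping rows are assumed rather than captured.

```javascript
// Example events; page view ids are from the output above, page pings are illustrative.
const events = [
  { tracker: "sp1", type: "page_view", pageViewId: "6b08f421-d347-4cb2-8a04-5746c0ebb3bd" },
  { tracker: "sp2", type: "page_view", pageViewId: "9fd9459b-8d58-4f34-86bd-9efed31b5590" },
  { tracker: "sp1", type: "page_ping", pageViewId: "9fd9459b-8d58-4f34-86bd-9efed31b5590" },
  { tracker: "sp2", type: "page_ping", pageViewId: "9fd9459b-8d58-4f34-86bd-9efed31b5590" },
];

// Return page views whose pageViewId is not referenced by any page ping.
function findOrphanedPageViews(allEvents) {
  const pingedIds = new Set(
    allEvents.filter((e) => e.type === "page_ping").map((e) => e.pageViewId)
  );
  return allEvents.filter(
    (e) => e.type === "page_view" && !pingedIds.has(e.pageViewId)
  );
}

const orphans = findOrphanedPageViews(events);
console.log(orphans.map((e) => `${e.tracker}: ${e.pageViewId}`));
```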

GideonShils avatar Oct 06 '23 20:10 GideonShils