user_property vs property

Open bonswouar opened this issue 11 years ago • 13 comments

Looking at the documentation, there seem to be two types of properties: some related to a User and some to an Event. But I don't see how to use them (with the Dashboard, for example), and I'm not sure they actually work. Here is what I did:

  • I tried to set some, using the API, as a second parameter of eventHub.identify() : it apparently sets it in the event (I can see this property on the last event of /users/timeline of this User, and in the Dashboard for this Event only)
  • I tried to set some using eventHub.register() : It's not on the /users/timeline, I can't find this property anywhere ?!

And when I execute curl http://localhost:8080/users/keys it always returns an empty array.

Did I miss something in the documentation ?


EDIT: Oh, when I use curl to update user information as in the documentation, it appears in /users/keys (though I still don't know how to use those properties in the Funnel/Cohort dashboard).

And I can see that it doesn't send those user properties in the network call. Here is my JS:

        var clientId = '1234';
        var name = "EventHub";
        var options = {
          url: 'http://localhost:8080',
          flushInterval: 1
        };
        var eventHub = window.newEventHub(name, options);
        eventHub.identify(clientId, {'ip2':'12.0.0.1'});
        eventHub.track('souscription', null);
        eventHub.register({
          'ip': '127.0.0.1'
        });

And here is the call it generates : http://localhost:8080/events/batch_track?callback=DevTips.jsonp._callback0&events=%5B%7B%22ip2%22%3A%2212.0.0.1%22%2C%22event_type%22%3A%22souscription%22%2C%22external_user_id%22%3A%221234%22%7D%5D
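For readability, the `events` parameter of that request can be decoded with the standard `URL` API. This is a minimal sketch, assuming a runtime where `URL` is a global (modern browsers, Node.js); `decodeBatchTrack` is a hypothetical helper name, not part of EventHub.

```javascript
// Hypothetical helper: extract and parse the JSON array carried in the
// `events` query parameter of a batch_track request URL.
function decodeBatchTrack(requestUrl) {
  // searchParams.get() percent-decodes the value before returning it.
  const events = new URL(requestUrl).searchParams.get('events');
  return JSON.parse(events);
}

// The request URL quoted above, split only for line length.
const requestUrl =
  'http://localhost:8080/events/batch_track' +
  '?callback=DevTips.jsonp._callback0' +
  '&events=%5B%7B%22ip2%22%3A%2212.0.0.1%22%2C%22event_type%22%3A%22souscription%22' +
  '%2C%22external_user_id%22%3A%221234%22%7D%5D';

const decodedEvents = decodeBatchTrack(requestUrl);
console.log(decodedEvents);
```

The decoded event carries `ip2` (passed to identify) but no `ip` key (passed to register).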

bonswouar avatar Oct 14 '14 14:10 bonswouar

Sorry about the confusion. When you use eventHub.identify or eventHub.register, those user properties will be stored in browser's local storage and those user properties will get merged into event properties as the events are tracked. That's why when you try to get the user properties from the backend, it doesn't show anything.

On the other hand, when you update the user information through the curl command, it directly updates the user on the backend, and those updates are reflected in /users/keys.
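The client-side merge described above could be pictured roughly as follows. This is an illustrative model only, not EventHub's actual source: `storage` is a plain `Map` standing in for the browser's local storage, the helper names and the `user_properties` key are hypothetical, and the precedence between user and event properties is an assumption.

```javascript
// Illustrative model only. `storage` stands in for the browser's local storage.
const storage = new Map();

// Hypothetical: remember user properties on the client side only.
function storeUserProperties(props) {
  const current = JSON.parse(storage.get('user_properties') || '{}');
  storage.set('user_properties', JSON.stringify({ ...current, ...props }));
}

// Hypothetical: build the event payload by merging the stored user
// properties with the properties passed to track().
// Assumed precedence: event properties override user properties.
function buildEvent(externalUserId, eventType, eventProps) {
  const userProps = JSON.parse(storage.get('user_properties') || '{}');
  return {
    ...userProps,
    ...(eventProps || {}),
    event_type: eventType,
    external_user_id: externalUserId,
  };
}

storeUserProperties({ ip2: '12.0.0.1' });
const mergedEvent = buildEvent('1234', 'souscription', null);
console.log(mergedEvent);
```

In this model nothing is ever sent to a user endpoint, which is consistent with /users/keys staying empty until the user is updated on the backend.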

chengtao avatar Oct 16 '14 22:10 chengtao

Okay, thanks for the explanation!

Though, while eventHub.identify works as you said, I apparently have a problem (or still a misunderstanding?) with eventHub.register. I changed the order (register before track, so that the register parameters will be merged into the track properties):

        var clientId = '1234';
        var name = "EventHub";
        var options = {
          url: 'http://localhost:8080',
          flushInterval: 1
        };
        var eventHub = window.newEventHub(name, options);
        eventHub.identify(clientId, {'ip2':'12.0.0.1'});
        eventHub.register({
          'ip': '127.0.0.1'
        });
        eventHub.track('souscription', {'ip3' : "192.168.0.0"});

Here is the (decoded) call generated :

http://localhost:8080/events/batch_track?callback=DevTips.jsonp._callback0&events=[{"ip2":"12.0.0.1","ip3":"192.168.0.0","event_type":"souscription","external_user_id":"1234"}]

(the register parameters haven't been merged, unlike the identify parameters)

bonswouar avatar Oct 17 '14 07:10 bonswouar

This scenario is kinda tricky, as identify and register were not meant to be used together. The properties specified via eventHub.register update the user properties of a system-generated user, while the properties specified via eventHub.identify update the user properties of the specified user. When eventHub.track is called, the system prefers the identified user over the system-generated user.
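A toy model of that preference, to make the observed payload easier to follow. It is not the library's code: `ClientModel`, its fields, and the `'generated-id'` value are hypothetical, and only the behavior described in this comment is modeled.

```javascript
// Illustrative model of the behavior described above, not the library's code.
class ClientModel {
  constructor() {
    // Hypothetical stand-in for the system-generated user.
    this.generatedUser = { id: 'generated-id', props: {} };
    // Set only once identify() has been called.
    this.identifiedUser = null;
  }

  // register() updates the properties of the system-generated user.
  register(props) {
    Object.assign(this.generatedUser.props, props);
  }

  // identify() updates the properties of the specified user.
  identify(id, props) {
    this.identifiedUser = { id, props: { ...props } };
  }

  // track() prefers the identified user over the system-generated one.
  track(eventType, eventProps) {
    const user = this.identifiedUser || this.generatedUser;
    return {
      ...user.props,
      ...(eventProps || {}),
      event_type: eventType,
      external_user_id: user.id,
    };
  }
}

const client = new ClientModel();
client.identify('1234', { ip2: '12.0.0.1' });
client.register({ ip: '127.0.0.1' });
const trackedEvent = client.track('souscription', { ip3: '192.168.0.0' });
console.log(trackedEvent);
```

Under these assumptions the tracked event contains `ip2` and `ip3` but not `ip`, matching the request shown earlier in the thread.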

chengtao avatar Oct 17 '14 07:10 chengtao

Okay I got it now, thanks for your answers !

bonswouar avatar Oct 17 '14 07:10 bonswouar

Though, maybe it would be useful to add it to the documentation (that they're not compatible together), or to change this (for example merging both anyway ?).

Actually I don't really understand why this complexity : wouldn't it be easier to use the same function for setting properties, no matter if it's a generated user or a specified one ?

As for alias/identify, why not use only identify (for example) and if there is already a generated user just automatically "alias" it ?

Those are just suggestions, and maybe I don't understand all the complexity of this JS library, but anyway thank you for this great job !

EDIT: Also, I was wondering, can you call alias on a user that was not generated? I mean, can you alias a user more than once? And does calling identify on an already identified user have the same effect as alias?

EDIT bis: Okay, I did some tests, and apparently alias works only on the generated user. This means I can't use my own ID generator with identify (to store this ID in a cookie for longer than a normal session) and then alias it (when the user signs up, for example). Is there something I missed?

EDIT bis bis : I tried to manually call alias with the wanted ID (something like : http://localhost:8080/users/alias?callback=DevTips.jsonp._callback1&[email protected]&to_external_user_id=customPreviouslyGeneratedId) , that works.. But I discovered another thing I don't understand : apparently you can call alias only once per "alias", meaning you can't link more than one session to a specific user id ?! Again, is there something I missed ?

bonswouar avatar Oct 17 '14 08:10 bonswouar

Just to explain the situation: I'd like to be able to track users who don't "identify" at first (anonymous visitors) and then map ("alias") all their previous sessions when they subscribe (for example).

bonswouar avatar Oct 21 '14 07:10 bonswouar

The design of alias and identify is similar to mixpanel, so that people who are familiar with mixpanel can quickly get started. Calling identify on an identified user simply overrides the user_id on the client side. That is different from calling alias, which tells the backend server that the from_external_user_id should be mapped to the to_external_user_id.

For alias, the backend supports what you describe, as it allows you to specify both from_external_user_id and to_external_user_id, while the javascript client-side library only allows you to set the from_external_user_id.
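Since the backend accepts both parameters, the request can be built by hand. This is a minimal sketch: the endpoint path and parameter names are taken from the requests quoted in this thread, while `buildAliasUrl` and the `user@example.com` address are hypothetical placeholders. The JSONP `callback` parameter seen in the thread is left out here.

```javascript
// Hypothetical helper: build an alias request URL with both ids set explicitly.
function buildAliasUrl(baseUrl, fromExternalUserId, toExternalUserId) {
  const url = new URL('/users/alias', baseUrl);
  // searchParams.set() percent-encodes characters such as '@' and ':'.
  url.searchParams.set('from_external_user_id', fromExternalUserId);
  url.searchParams.set('to_external_user_id', toExternalUserId);
  return url.toString();
}

const aliasUrl = buildAliasUrl(
  'http://localhost:8080',
  'user@example.com',            // placeholder address
  'customPreviouslyGeneratedId'  // id produced by a custom generator
);
console.log(aliasUrl);
```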

Also, can you elaborate more about your situation? Are you using your own id generator?

chengtao avatar Oct 21 '14 18:10 chengtao

Yes I'm using my own ID Generator (which I store in a persistent cookie).

So, if I understand right, I should always use identify with my generated ID, right ?

And then I would need to modify alias from the JS library to be able to use my own ID instead of the default generated one.

But even then, this scenario doesn't work (for testing I alias manually with curl) :

  • identify with my generated ID, track an event : http://localhost:8080/events/batch_track?callback=DevTips.jsonp._callback0&events=%5B%7B%22url%22%3A%22home%22%2C%22event_type%22%3A%22pageview%22%2C%22external_user_id%22%3A%22session%3A504855349544a06ae6a1e87.07341438%22%7D%5D
  • alias this generated ID to an email (after the user sign-up for example) : http://localhost:8080/users/alias?callback=DevTips.jsonp._callback1&[email protected]&to_external_user_id=session:504855349544a06ae6a1e87.07341438

Until here that's fine, I've got my event in the timeline of the "alias" (his email - [email protected] in the example). But then :

  • the user comes back on a new device and/or in a new session : identify with another custom generated ID, then track another event : http://localhost:8080/events/batch_track?callback=DevTips.jsonp._callback0&events=%5B%7B%22url%22%3A%22contact%22%2C%22event_type%22%3A%22pageview%22%2C%22external_user_id%22%3A%22session:1973252877544a086754fcf7.41512859%22%7D%5D
  • alias this new generated ID to his (previously used) email (after the user log-in for example) : http://localhost:8080/users/alias?callback=DevTips.jsonp._callback1&[email protected]&to_external_user_id=session:1973252877544a086754fcf7.41512859

=> Here, when I check the timeline of the email ([email protected]), the previous events have disappeared; it only keeps the ones from the last session.

I also tried to alias the 2 custom generated IDs together, but it has no effect.

bonswouar avatar Oct 24 '14 09:10 bonswouar

Ya, alias simply points one id to another; it doesn't merge events. In your scenario, your second alias just points [email protected] to the newly created user. That's why the previous events seem to disappear: they are still stored under the previous user. In your scenario, you should call alias once, when the user signs up (not when they sign in), and at the top of each page load identify the user either via the email (if signed in) or via your persisted generated id from the cookie (if not signed in). Will that solve the problem?
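To illustrate why the earlier events seem to vanish, here is a toy model of "alias points one id to another without merging". It is not the backend's code: `AliasModel`, its fields, the session ids, and the `user@example.com` address are hypothetical.

```javascript
// Toy model of the behavior described above; not the backend's code.
class AliasModel {
  constructor() {
    this.userOf = new Map();   // external id -> internal user number
    this.eventsOf = new Map(); // internal user number -> list of event types
    this.nextUser = 0;
  }

  // Record an event, creating a new internal user for an unknown id.
  track(externalId, eventType) {
    if (!this.userOf.has(externalId)) {
      this.userOf.set(externalId, this.nextUser);
      this.eventsOf.set(this.nextUser, []);
      this.nextUser += 1;
    }
    this.eventsOf.get(this.userOf.get(externalId)).push(eventType);
  }

  // Point `fromId` at the user behind `toId`; no events are copied or merged.
  alias(fromId, toId) {
    this.userOf.set(fromId, this.userOf.get(toId));
  }

  timeline(externalId) {
    const user = this.userOf.get(externalId);
    return user === undefined ? [] : [...this.eventsOf.get(user)];
  }
}

const model = new AliasModel();
model.track('session:A', 'home');
model.alias('user@example.com', 'session:A');
const afterFirstAlias = model.timeline('user@example.com');

model.track('session:B', 'contact');
model.alias('user@example.com', 'session:B'); // re-points the email
const afterSecondAlias = model.timeline('user@example.com');
const firstSessionEvents = model.timeline('session:A');
console.log(afterFirstAlias, afterSecondAlias, firstSessionEvents);
```

In this model the first session's event is still stored under `session:A`; it just is no longer reachable through the email once the second alias re-points it.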

朱政道 Cheng-Tao Chu

chengtao avatar Oct 24 '14 18:10 chengtao

Well, no, it doesn't really solve the problem. As expected, alias works when the user signs up. But then, if the user comes back with a new device (for example), identifying the user doesn't "identify" the events tracked before the user logged in. This means that, for an anonymous user, all events from before the login won't be linked to the actual user timeline. Is there any way to get around that problem (I could contribute to the project if this evolution is possible)?

bonswouar avatar Oct 27 '14 09:10 bonswouar

There are definitely ways to solve the problem, and the easiest way is probably adding another endpoint, say pseudo-merge (will need to come up with some better name). With it, the system will maintain a mapping from one id, say [email protected], to a set of ids which are all essentially the same user, say [email protected] and [email protected]. At query time, we can look up all events from the id set and do the merge on the fly. If you are interested in helping build that, I can point you to where the source code needs to be modified.
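The proposed idea could look roughly like this. It is only a sketch of the concept: the real backend is written in Java, and every name here (`MergedTimeline`, `link`, the numeric `time` field, the example ids) is hypothetical.

```javascript
// Sketch of the proposed "pseudo-merge" idea only; all names are hypothetical.
class MergedTimeline {
  constructor() {
    this.eventsOf = new Map(); // id -> [{ type, time }]
    this.linked = new Map();   // primary id -> Set of ids for the same user
  }

  track(id, type, time) {
    if (!this.eventsOf.has(id)) this.eventsOf.set(id, []);
    this.eventsOf.get(id).push({ type, time });
  }

  // Record that `otherId` is essentially the same user as `primaryId`.
  link(primaryId, otherId) {
    if (!this.linked.has(primaryId)) this.linked.set(primaryId, new Set());
    this.linked.get(primaryId).add(otherId);
  }

  // Merge on the fly at query time: gather events from every linked id,
  // then order them by time. Stored events are never moved or copied.
  timeline(primaryId) {
    const ids = [primaryId, ...(this.linked.get(primaryId) || [])];
    return ids
      .flatMap((id) => this.eventsOf.get(id) || [])
      .sort((a, b) => a.time - b.time)
      .map((event) => event.type);
  }
}

const hub = new MergedTimeline();
hub.track('session:A', 'home', 1);
hub.track('session:B', 'contact', 2);
hub.track('user@example.com', 'purchase', 3);
hub.link('user@example.com', 'session:A');
hub.link('user@example.com', 'session:B');
const mergedTimeline = hub.timeline('user@example.com');
console.log(mergedTimeline);
```

Because the merge happens only when a timeline is queried, linking a further session later does not require rewriting any stored events.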

chengtao avatar Oct 27 '14 17:10 chengtao

I am definitely interested in helping. I'll take a deeper look at the source code; if you have any hints that might be helpful, don't hesitate to share them!

bonswouar avatar Oct 28 '14 15:10 bonswouar

Great, all the api endpoints can be found in web/src/main/java/com/codecademy/eventhub/web/commands

We will need another index to track, given a user id, which other user ids have events that need to be merged. The implementation of some other indices can be found in hub/src/main/java/com/codecademy/eventhub/index

Then, we will also need to modify all the public methods for query in hub/src/main/java/com/codecademy/eventhub/EventHub.java which includes getUserEvents, getFunnelCounts, and getRetentionTable...

Lastly, we will need to modify the test cases accordingly.

chengtao avatar Oct 28 '14 20:10 chengtao