Remaining tasks for person-on-events

Open macobo opened this issue 2 years ago • 3 comments

Background

person-on-events work has been ongoing for a long while now. This issue consolidates the remaining work in one place so that every team is aware of what is left.

What's left TO DO

Ingestion

Owners: team ingestion

  • [ ] Make sure every event has a person_id
      • [ ] Finish run of 0006 migration on cloud (status: 99% finished, some events from 2021 need work)
      • [x] Fix bug affecting ~0.001% of events https://github.com/PostHog/posthog/pull/11077 https://github.com/PostHog/posthog/pull/11084 @macobo @tiina303
      • [ ] Re-migrate rows with missing person_id on cloud
      • [ ] Release 0006 async migration in 1.39.0
  • [ ] Release buffer as part of 1.39.0
  • [ ] Right to be forgotten: Create tooling to allow purging person and groups data from events (proposed next sprint goal)
  • [ ] Resharding events table to be sharded by person_id - not urgent, this can be done after releasing everything else
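
As a rough illustration of the "every event has a person_id" check above, here is a minimal Python sketch that measures the share of events still missing a person_id. The row shape (dicts with a `person_id` key), the function names, and the treatment of an all-zero UUID as "missing" are assumptions made for illustration; they are not taken from the PostHog codebase or the 0006 migration.

```python
from typing import Iterable, Mapping, Optional

# Assumption: a missing person_id appears as None, an empty string,
# or an all-zero UUID. The real storage representation may differ.
ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def is_missing_person_id(person_id: Optional[str]) -> bool:
    """Return True when the value does not identify a person."""
    return person_id is None or person_id in ("", ZERO_UUID)


def missing_person_id_ratio(events: Iterable[Mapping[str, Optional[str]]]) -> float:
    """Fraction of events without a usable person_id (0.0 for no events)."""
    total = 0
    missing = 0
    for event in events:
        total += 1
        if is_missing_person_id(event.get("person_id")):
            missing += 1
    return missing / total if total else 0.0
```

A check like this could run against a sample of rows before and after the re-migration to confirm the ratio drops to zero.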

Queries

Owners: team west (cc @neilkakkar @EDsCODE)

  • [ ] Column materialization improvements - We won't be moving to JSON data type anytime soon: context. Hence we need to make property materialization code a lot better:
      • Make sure it functions in a stable way on cloud
      • person/groups column materialization logic for events table
  • [x] Update queries to use new person_created_at and other new created_at columns
  • [ ] Verify all queries work

Releasing

Once we're happy with the work above, we can enable person-on-events on cloud. I'd suggest the following release pattern:

  • [ ] Add a feature flag we can toggle for this
  • [ ] Enable for team 2, communicate internally
  • [ ] Bugsquash
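
The release pattern above (a toggle, enabled first for team 2) could look roughly like the following Python sketch. The class name, its fields, and the allow-list approach are hypothetical and only illustrate the staged rollout; they do not describe how PostHog's feature flags are actually implemented.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PersonOnEventsRollout:
    """Hypothetical rollout gate for person-on-events queries."""

    enabled: bool = False            # global kill switch
    enabled_for_all: bool = False    # final stage: everyone on cloud
    allowed_team_ids: Set[int] = field(default_factory=set)

    def is_enabled(self, team_id: int) -> bool:
        # The kill switch wins over everything else.
        if not self.enabled:
            return False
        return self.enabled_for_all or team_id in self.allowed_team_ids


# Stage 1: only team 2 sees person-on-events.
rollout = PersonOnEventsRollout(enabled=True, allowed_team_ids={2})
```

Keeping a separate kill switch means the feature can be turned off for every team at once if bugsquashing uncovers a serious problem.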

Owners: @EDsCODE is team west ready to own this?

Communication

  • [ ] Updating documentation on the conversion buffer
  • [ ] Updating documentation on person-on-events and impact of this
  • ....

Owners: Unclear - originally Marcus was intended to help here. Probably me and Yakko on dev side?


Let me know if I missed anything important task-wise.

macobo, Aug 02 '22 09:08

Note that there's a project, https://github.com/orgs/PostHog/projects/41/views/1, though some of the tasks there don't need to be there. I'll let you take a look first, though.

tiina303, Aug 02 '22 10:08

The project is for our team (team-ingestion); this task is about cross-team collaboration and syncing. Different goals.

macobo, Aug 02 '22 10:08

At a minimum, from the items in the project, we need the buffer roll-out to reach everyone (cloud and self-hosted) before the async migration and before "Re-migrate rows with missing person_id on cloud".

tiina303, Aug 02 '22 15:08