Improve person and groups deletion
Is your feature request related to a problem?
As part of shipping person-on-events, we should iterate on our tooling around deleting persons (and add tooling for deleting groups).
With person-on-events, person and group data will be stored in the ClickHouse events table, and deleting data from there is more expensive than usual.
Describe the solution you'd like
- Uniform design for deletions both for persons and groups (cc @clarkus)
- Ideally deletion should have two options - delete just the 'person'/'group' vs delete 'person'/'group' and all associated events.
- After deletion, we should show a notification that event data is deleted asynchronously.
- Mechanism for doing bulk deletions in events table
- We delete from the persons/groups tables immediately and schedule the event deletion for the weekend, when event data is removed for all deleted teams/persons/groups that asked to have it scrubbed.
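A rough sketch of the two-phase flow proposed above, in case it helps the discussion. Everything here is hypothetical: `DeletionQueue`, `run_scheduled_scrub` and `next_weekend_run` are illustrative names, the tables are in-memory stand-ins, and "Saturday 00:00" is only an assumed reading of "weekend".

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class DeletionQueue:
    """Hypothetical in-memory stand-in for the persons table, events table and pending list."""
    persons: dict = field(default_factory=dict)  # person_id -> properties
    events: list = field(default_factory=list)   # each event: {"person_id": ..., "event": ...}
    pending: set = field(default_factory=set)    # person ids whose events await scrubbing

    def delete_person(self, person_id, delete_events=False):
        # Phase 1: the person row is removed immediately.
        self.persons.pop(person_id, None)
        # Phase 2 is only queued here; events are never deleted inline.
        if delete_events:
            self.pending.add(person_id)

    def run_scheduled_scrub(self):
        # One pass over the events table covers every queued id.
        before = len(self.events)
        self.events = [e for e in self.events if e["person_id"] not in self.pending]
        self.pending.clear()
        return before - len(self.events)


def next_weekend_run(now):
    """Next Saturday 00:00 strictly after `now` (the weekday is an assumption)."""
    days_ahead = (5 - now.weekday()) % 7  # Monday is 0, Saturday is 5
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate
```

Batching all queued ids into a single pass is the point of the design: the expensive events-table operation runs once per schedule instead of once per deleted person.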
Additional context
cc @clarkus This is team ingestion next sprint priority - would it be possible to cook up some designs around both deletion flows?
I can try to help. I'm already pretty booked with other sprint tasks, so I'll have to get to this one after those that were already scheduled. How soon do you need a solution? A few other questions:
- How long does deletion typically take?
- You mentioned it was asynchronous. What is the state of that data in the interim, until the delete completes? Does it still persist and show up in insights, or is it hidden immediately until permanently deleted?
- What's the use case for deleting the group type or a person but not the associated events? Wouldn't that leave some gaps or orphaned data in some form? Are there any concerns we should communicate to the user before they confirm the action?
- Anything else we should try to communicate to users who do this?
Quick answers to get the ball rolling, will go more in depth tomorrow if needed.
> How long does deletion typically take?

Event data deletion happens in a cron job that runs once per week by default; the schedule can be configured in the chart.
Group / Person is deleted immediately.
> You mentioned it was asynchronous. What is the state of that data in the interim, until the delete completes? Does it still persist and show up in insights, or is it hidden immediately until permanently deleted?
The group or person would be deleted ~immediately. This means they wouldn't show up on the Groups & Persons page nor in any person modals in trends.
However they would still appear in trends - e.g. when counting unique users, filtering by user properties and so on.
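To make the interim state concrete, here is a toy illustration (not PostHog code; the data and function names are made up). The person disappears from list views as soon as the row is deleted, but anything computed from events still counts them until the scheduled scrub runs.

```python
# Hypothetical in-memory data; none of these names come from PostHog itself.
persons = {"p1": {"email": "a@example.com"}, "p2": {"email": "b@example.com"}}
events = [
    {"event": "pageview", "person_id": "p1"},
    {"event": "pageview", "person_id": "p2"},
    {"event": "pageview", "person_id": "p1"},
]


def listed_persons(persons):
    """What a persons list page would show: only rows still in the persons table."""
    return sorted(persons)


def unique_users(events, event_name):
    """What a trends query would count: distinct person ids found on the events."""
    return len({e["person_id"] for e in events if e["event"] == event_name})


del persons["p1"]  # the immediate part of the deletion

print(listed_persons(persons))         # p1 is no longer listed
print(unique_users(events, "pageview"))  # p1 is still counted from event data
```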
> What's the use case for deleting the group type or a person but not the associated events? Wouldn't that leave some gaps or orphaned data in some form? Are there any concerns we should communicate to the user before they confirm the action?
The deletion operation is not for group types (those things you can have up to 5 of) but for individual groups - e.g. in our case individual organizations.
Deleting associated event data might not always be needed or desired, as there's no 1-1 mapping between groups and events or persons. Given that a particular event can belong to an organization, an instance and a person, would you expect all persons who ever did a particular event in an org to be deleted? If you delete one group and automatically delete all its events, it would also negatively impact other group types, as their data is now incomplete.
That's why I suggest providing the option of deleting associated events, but not defaulting to it.
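A small made-up example of the side effect described above. The event shape and the `organization` / `instance` keys are assumptions for illustration only; the point is that scrubbing one organization's events also changes the counts for the instance group type.

```python
from collections import Counter

# Hypothetical events, each tagged with a person and two group types.
events = [
    {"event": "insight viewed", "person_id": "p1", "organization": "org_a", "instance": "inst_1"},
    {"event": "insight viewed", "person_id": "p2", "organization": "org_a", "instance": "inst_2"},
    {"event": "insight viewed", "person_id": "p3", "organization": "org_b", "instance": "inst_1"},
]


def count_by(events, group_type):
    """Number of events per group key for one group type."""
    return Counter(e[group_type] for e in events)


def scrub_group(events, group_type, group_key):
    """Drop every event that belongs to the given group."""
    return [e for e in events if e.get(group_type) != group_key]


before = count_by(events, "instance")
after = count_by(scrub_group(events, "organization", "org_a"), "instance")
print(before)  # inst_1 has two events, inst_2 has one
print(after)   # inst_1 drops to one event and inst_2 disappears entirely
```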
> Anything else we should try to communicate to users who do this?
See original issue:
- It should be a clear choice whether to delete event data or not (I'd even default to NO)
- We should provide feedback when event data would be scrubbed.
I made some updates for this at https://www.figma.com/file/gQBj9YnNgD8YW4nBwCVLZf?node-id=12513:58211#243945135. There's a bit more here just to update the filter patterns, but the core thing is the delete action and the dialog that prompts for the scope of deletion.
Tangent question - groups can be expanded to show their properties. Is there ever a case where we want to allow a user to delete a specific property value or values for a specific group?
> I made some updates for this
:bow: left a few questions
> Tangent question - groups can be expanded to show their properties. Is there ever a case where we want to allow a user to delete a specific property value or values for a specific group?
Yes if we want to allow the same in persons.
Not in scope for this though.
Related https://github.com/PostHog/posthog/issues/9535
Adding to this that a customer requested bulk deletion of people (something like a delete request with a list of IDs or a matching filter). As dangerous as this endpoint could be, without it they are simply opting to iterate over all the people they need to delete and individually call the delete endpoint, so in essence it's the same thing but with less optimisation.
Not sure if this is the same as the issue here but definitely feels related
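For reference, the workaround described above looks roughly like this on the client side. This is a sketch, not the actual API: `delete_one` is a hypothetical callable standing in for whatever performs a single delete request, and `chunked` only shows how a list of IDs might be batched if a bulk endpoint existed.

```python
def delete_individually(person_ids, delete_one):
    """Client-side workaround: one delete call per person.

    `delete_one` is a hypothetical callable that deletes a single person
    and raises on failure. Returns the ids that could not be deleted.
    """
    failed = []
    for person_id in person_ids:
        try:
            delete_one(person_id)
        except Exception:
            failed.append(person_id)
    return failed


def chunked(ids, size):
    """Yield successive lists of at most `size` ids, e.g. as bulk request payloads."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]
```

With N people the loop makes N requests, whereas a bulk request would need only `ceil(N / size)` of them, which is the optimisation gap mentioned above.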
Sounds like the same issue just with a slightly expanded workflow. It'd mean supporting bulk actions from the list views. This is something I've been wanting to add for most list views, but it does make a lot of sense here. @macobo I can draw that for future work, but I also understand if it's out of scope for this sprint.
> Adding to this that a customer requested bulk deletion of people (something like a delete request with a list of IDs or a matching filter)

This is separate from the rest of this ticket - the goal here is to iterate on existing behavior with person-on-events on the horizon.
Note that we probably should dig into the product use-cases more before committing to building anything rather than serve the technical request without understanding what we're building for.
Limiting the scope of this task to only deal with persons and teams (which https://github.com/PostHog/posthog/pull/11347 + some documentation will then allow closing).
Groups require ClickHouse schema changes, which make sense to do together with https://github.com/PostHog/posthog/issues/10248 to avoid issues related to async migrations.
I've only read the main post, but I came here looking for a way to delete a group from the UI.

I mistakenly used the wrong variable to name my "Group" (should have been "Companies"), and don't know how to fix that now.
I'm on Cloud. I deleted Persons [and corresponding events], and new events for the distinctId stopped appearing in the Live Events feed. So did non-identified events.
What should the right logic be here: is the Person 'undeleted' automatically, is there an explicit warning about the gap between the Person delete and events, or is there some setting? Things around this should be explicit.
I'd only started playing with PostHog earlier in the day, and this shook my faith in its reliability and diagnostic ability.