Deprecation plan for PostgreSQL usage in analytics
We believe that PostHog will have to run on ClickHouse instead of Postgres as the main analytics database in the long term in order to handle larger volumes of events and provide more robust analytics capabilities. ClickHouse scales better for large datasets and building/supporting two platforms isn't cost effective.
This issue is created to build a deprecation plan that we can update and share with the community.
Initial thoughts on what this should include:
- [x] build and release a PostgreSQL -> ClickHouse migration plan
- [x] agree with the product team which will be the last release we will ship with PostgreSQL support
- [x] Make sure we have a hobbyist version of the ClickHouse deployment
- [x] Documentation for it
- [ ] Deprecation notice in the app for self-hosted Postgres users
- [x] Preflight & status page to check that ClickHouse is up too
- [x] Email blast to everyone + personal outreach to big customers on postgres
- [ ] after/during that release:
  - [ ] ship a migration to deprecate and delete all the data in PostgreSQL no longer needed
  - [ ] delete all the related methods / code etc... we no longer need
  - [ ] clean up our docs
For marketing/growth, I'm not sure of the best way to reach these users, but e.g. here we can see some site_urls & users_who_logged_in__0__email, which might be the best email to use if we want to do email outreach.
@kpthatsme - from a growth perspective any thoughts? I suspect to best retain all the existing Postgres customers longer term they need to migrate as otherwise their experience will get worse over time.
@joethreepwood - could the marketing team help out with some email outreach, and potentially something else too?
Yep, had a quick look at the scale of this and it feels like it'd be worth putting some collateral behind.
My suggestion would be that we create a dedicated landing page or blog post with a full explanation. We should then use whatever channels let us target these users and direct them to that explanation: email and potentially an in-app banner are the obvious ones, if we can target them correctly.
The main things we'd need from the Marketing side would be:
- Knowing when exactly we want to announce this (guessing ASAP?)
- Knowing exactly who we're targeting (email addresses for Mailchimp)
I think I know enough to cover the general messaging, so we can skip over that for now.
yes ASAP. Regarding emails, the best I came up with so far is https://github.com/PostHog/posthog/issues/6662#issuecomment-979442656
> @kpthatsme - from a growth perspective any thoughts? I suspect to best retain all the existing Postgres customers longer term they need to migrate as otherwise their experience will get worse over time.
Thanks for the fyi @tiina303! The way I see it, this needs to happen for all the reasons we've all chatted about and Guido mentions above, so I'm mainly looking at this from an impact standpoint.
Right now it looks like there's still a split towards postgres-based deployments.
My main reaction to this is - aren't these people already being impacted? I think that they don't have access to most of the newer things that we've built, including free things (such as funnels 2.0).
So I wonder if we've heard anything from them, if not maybe they aren't really forward pushing orgs or orgs that need advanced functionality. Alternatively, we've done a poor job messaging on the new features and reasons to upgrade.
@joethreepwood on the subject of email, and before investing in the landing page: it might be worth reaching out to a few of these ~50 or so self-hosted orgs and asking them if they know about the new features they'd get if they upgrade to ClickHouse, wdyt? I'm thinking it might help inform the content needed on that page.
> So I wonder if we've heard anything from them.
I haven't heard anything from anyone on this topic, fwiw.
> yes ASAP. Regarding emails, the best I came up with so far is [#6662 (comment)]
Thanks Tiina. I saw that, but I couldn't see an easy way to export the list of emails from that query. This is probably me just being a dullard?
I've carved out some time this week to start working on the messaging. I think we should absolutely start with an email.
Would it be possible to do an in-app banner that targets these users too? This would probably be the most powerful way to hit active users.
@kpthatsme
> Right now it looks like there's still a split towards postgres-based deployments.
Here's a slightly different split from your graph: split by version, set to unique users, and filtered to exclude ClickHouse (the filter may not matter here, since it only affects older versions where we didn't send realm and sent lots of reports). We only see PostHog versions from 1.29 onwards, as we likely didn't send the org usage report before that.
Here we can see more versions, though we might still be missing some.
If we look at the split by realm only over 30 days, we can see that the number of Postgres users (275) is slowly decreasing and the number of ClickHouse users (140) is increasing.
> My main reaction to this is - aren't these people already being impacted? I think that they don't have access to most of the newer things that we've built, including free things (such as funnels 2.0).
Yes they are. Furthermore there are some Postgres users on much older versions of PostHog as you could see above. One of the reasons we have many older versions could be that folks deployed and forgot about it (if it was free on Heroku for example).
> Alternatively, we've done a poor job messaging on the new features and reasons to upgrade.
The problem might be that we left the impression that migrating from Postgres isn't easy and that we're still working on it. That isn't the case anymore, but we haven't publicised that much afaik.
@joethreepwood
> Would it be possible to do an in-app banner that targets these users too? This would probably be the most powerful way to hit active users.
We could, but the tricky thing here is that users would only get it if they upgrade (since they are self-hosted). Given that we want to reach out to some by email, I suggest we do that for the folks who are most up to date, i.e. Postgres users on version 1.30. They might have set up automated upgrades and not actually be using PostHog, but they are still the best bet. We can revisit the banner idea afterwards.
> I saw that, but I couldn't see an easy way to export the list of emails from that query. This is probably me just being a dullard?
Nope not easy. Let me get back to you on that, specifically for 1.30 postgres users emails (about 59 based on the earlier graphs).
One of the reasons folks haven't moved is that self-hosting isn't yet quite as simple as it was on Heroku. On that note, in the email/blog post we'll probably want a section about what to use, proposing, depending on the situation, either cloud, hobbyist curl, digital ocean, ... The platform team can write that part.
Here are some emails:
Check the details below this insight
A not-as-good alternative: https://docs.google.com/spreadsheets/d/1iB-tcsKSrMp1A0hfepAQ1wNO3rQWzS1qDkB77z9ouhw/edit?usp=sharing I got this from events search: I downloaded the data, searched for the email column, and copied it to the front.
OK, I've put together some email copy. We're going to need a Mailchimp design and optionally some artwork, so I've split that out into a different issue to stop it getting too noisy up in here.
@guidoiaquinti you created this issue so I've tagged you there to review the copy and make sure it's technically correct. Let me know if this is wrong!
I'd definitely appreciate everyone's input on whether to set a deadline, which I think would be valuable.
After taking some feedback, we'll now send two emails about this deprecation - probably two weeks apart. Tim has indicated we can push these emails at any point from a product perspective, but I think we should be confident in the documentation we're pointing towards. It's currently marked as being in beta.
@guidoiaquinti @tiina303 Do you have an idea of when you'd be happy finalising the documentation? I can then time the first email to that.
https://github.com/PostHog/posthog.com/pull/2490 from Yakko already + https://github.com/PostHog/posthog.com/pull/2569
> Here are some emails: Check the details below this insight
> Not as good alternative: https://docs.google.com/spreadsheets/d/1iB-tcsKSrMp1A0hfepAQ1wNO3rQWzS1qDkB77z9ouhw/edit?usp=sharing I got this from events search downloaded the data & searched for the email column that I then copied to the front.
There are some differences between these sources, so for now I'm merging them together and will send the email to all of them.
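For anyone repeating this, here is a rough sketch of how the two sources could be merged before uploading to Mailchimp. It is only an illustration: the function names and the example addresses are made up, and it assumes each source has already been exported as a plain list of email strings.

```python
def normalize(email: str) -> str:
    """Trim whitespace and lowercase so the same address compares equal."""
    return email.strip().lower()


def merge_email_lists(*sources):
    """Merge several email lists, dropping blanks, obvious junk and duplicates.

    Order of first appearance is preserved so the result is stable.
    """
    seen = set()
    merged = []
    for source in sources:
        for raw in source:
            email = normalize(raw)
            # Very light sanity check only; this is not real email validation.
            if email and "@" in email and email not in seen:
                seen.add(email)
                merged.append(email)
    return merged


# Hypothetical exports from the insight and from the spreadsheet.
insight_emails = ["Alice@example.com", "bob@example.com "]
spreadsheet_emails = ["bob@example.com", "carol@example.com", ""]

recipients = merge_email_lists(insight_emails, spreadsheet_emails)
print(recipients)
```

Normalising before comparing matters here because the two exports may differ in casing or stray whitespace for the same address.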
@tiina303 do you have an update about this?
@EDsCODE was working on the code side to remove all Postgres analytics stuff, we can probably close this issue if you're tracking that work elsewhere.
Were those points completed?
- ship a migration to deprecate and delete all the data in PostgreSQL no longer needed
- delete all the related methods / code etc... we no longer need
- cleanup our docs
Point #2 is in progress in various places. It will be fully prioritized in the next sprint.
👋 Hi! Here I am again for a check-in cc @tiina303 @EDsCODE
Still pending! Haven't gotten back to cohorts yet