
Sprint - July 1 to July 12, 2024

Open mariusandra opened this issue 1 year ago • 1 comments

Global Sprint Planning

3 things that might take us down

Team sprint planning

For your team sprint planning copy this template into a comment below for each team.

# Team ___

**Support hero:** ___

## Retro

<!-- Grab the high and low priority items from last time and add whether that item was completed or not -->

- 

## Hang over items from previous sprint

<!-- For each item, decide to re-prioritise (and add below) or deprioritise -->

- Item 1. prioritised/deprioritised

## OKR

1. OKR, status (red/yellow/green) and action points if yellow/red


### High priority

-

### Low priority / side quests

-

mariusandra avatar Jun 28 '24 13:06 mariusandra

Team CDP, WIP.

We're both off the last week of the previous sprint, so posting this already.

Retro

  • Marius: did a lot of work this week to improve the editing experience.
  • Ben:

Hang over items from previous sprint

  • Secret management (probably go for simple encrypted field in django) @benjackwhite
  • Hook up to rusty webhook service (hopefully Brett, if not Marius)
  • Go through existing CodeQL reports and sanity check the safety of the system @mariusandra
  • Ensure every existing plugin destination could be built with Hog and build them as templates @mariusandra
  • Add calculation and limits for resource usage (memory)

OKR

  • Goal 1: Widespread usage goal
    • Generally available to all customers
    • 5 happy customers (tight feedback loop with them)
    • Get all post-ingestion plugins migrated to Hog Functions
    • Idea: Template gallery (publish your own template for others to use)
    • Scaling work
  • Goal 2: Messaging V1
    • Build on top of hog functions to have “HogWorkflows”
    • Requirements gathering - what do we need to build here
    • We should be able to replace some (or all) of our customer.io workflows with our product
  • Goal 3: Hog Functions as a building block
    • Work with other teams to spread understanding of the power of Hog (functions)
    • Generate various use cases for embeddable functions
      • Multiple sources for functions (ActivityLog, InternalEvents, Alerts)
      • More destinations for functions (tracking events, updating person properties)

Sprint plan

Megaissues: CDP & Hog

  • Goal 1 @benjackwhite
    • Launch the private beta
    • Get 5 happy users
    • Ensure the system stays up and running
  • Goal 2
    • Hook up the rusty webhook service
  • Goal 3 @mariusandra
    • Port over all existing destinations
    • Implement missing language features

mariusandra avatar Jun 28 '24 13:06 mariusandra

Team Error Obsession, Obsessing on things

Support hero: @pauldambra

  • @pauldambra - out a day and a bit
  • @marandaneto - out most of the sprint

items from previous sprint

High priority

  • ✅ Q3 goals - everyone
  • ✅ masking text in the screenshot images on iOS @marandaneto
  • 🌀 Replay React Native plugin for Android and iOS @marandaneto - going faster than expected - good results but tricky
  • 🌀 Universal filters released for everyone @daibhin - definitely will be release ready this week
    - will roll out to team 2 and a few customers waiting on it in support tickets
    - it's a big change so we want to get some feedback before big bang
  • 🌀 Error tracking @daibhin @pauldambra
    - ⛔ Generate embeddings in plugin server using new S3 mounted disk
      - learned we don't need to do this first, but that we can if/when we want to
    - ⛔ Materialized embeddings column on the events table
      - learned this is possible but not needed yet
    - ✅ Playlist of errors & associated recordings
      - using the new virtual playlist code 🎉
    - ✅ Figure out grouping (https://github.com/PostHog/product-internal/pull/615)
      - have a plan here and need https://github.com/PostHog/posthog/issues/23395 before we can really start experimenting
    - ✅ first user interview done
      - great 30 minutes with $largeEUCustomer

Low priority / side quests

  • 🌀 we are recruiting beta testers for Android and iOS - (we should have screenshots now) @marandaneto @annikaschmid

OKR

  1. OKR, status (red/yellow/green) and action points if yellow/red
  • 🟡 📱 Goal 1: People think of PostHog as a mobile solution
  • 🟡 🪲 Goal 2: Error tracking in people's hands
  • 🟡 ⁉️ Goal 3: Hiring

High priority

  • pricing changes @pauldambra
    • https://github.com/PostHog/product-internal/pull/578
    • 2 outcomes
      • price changes released
      • documentation written for next time
  • universal filters
    • testing with users and rolling out @daibhin
  • error tracking
    • ingestion fangling @pauldambra
      • https://github.com/PostHog/posthog/issues/23395
    • local stack trace demangling
    • start adding assignee/resolve issue states in postgres @daibhin
      • do we delay solving assigning to groups of people
      • assign tags to groups too
    • 3 more customer interviews @daibhin
    • sparklines @daibhin

Low priority / side quests

  • we are recruiting beta testers for Android and iOS - (we should have screenshots now) @marandaneto @annikaschmid
  • mobile support @pauldambra
    • holding the fort while @marandaneto relaxes in Brazil
  • continue investigating recordings snapshots capture @pauldambra
  • finish the rrweb alpha-16 upgrade! @daibhin and @pauldambra

pauldambra avatar Jul 03 '24 09:07 pauldambra

Team Data <->, collecting of Hogs and more

OKR Q2 2024

Objective

Query 3000

  • Key Results:
    • Autocomplete
    • Increase general BI experience/product https://github.com/PostHog/meta/issues/157
    • Declutter the data warehouse UI and make the features intuitive to find

Data Modeling MVP

  • Key Results:
    • Infrastructure decided and implemented
    • Integrating external data with feature flags
    • External data everywhere in insights/persons/cohorts
    • Get billing team to use modeling in posthog for their invoices_with_annual table

Retro

  • [x] postgres and other database incremental syncs @Gilbert09
  • [ ] WIP onboarding flow @EDsCODE
  • [ ] WIP errors from syncing and linking shown @EDsCODE
  • [x] person model batch exports @tomasfarias
  • [x] completing these todos for pricing release: https://github.com/PostHog/posthog/issues/22669
    • Marketing PRs are in review
    • Billing decided on and all logic all in review

High Priority

  • [ ] finish launching pricing @EDsCODE
  • [ ] finish work for showing errors from syncs @EDsCODE
  • [ ] Improve error visibility from querying @Gilbert09
  • [ ] data modeling (TBD after planning meeting) @tomasfarias
  • [ ] add historical exports to pipeline 3000

EDsCODE avatar Jul 03 '24 13:07 EDsCODE

Team Product Analytics

Support hero: Week 1: @thmsobrmlr @skoob13 (secondary) Week 2: @Twixes @thmsobrmlr (secondary)

Time off: @aspicer only there for first three days

Retro

  • 🟡 Getting rid of remaining legacy filters use continued (owner: @thmsobrmlr)
    • This turned out to be quite complex, as we store insights and their dashboards in memory for "fast mode", but don't actually have a normalized state. Added Playwright e2e tests, which aren't yet working reliably on CI. For now back to refactoring. We'll want to add e2e tests as part of the quarterly goals and it might make sense to talk about normalization or an alternative like GraphQL to improve maintainability.
  • 🔴 Experiments migrated from legacy trends/funnels to HogQL-based (owner: @thmsobrmlr)
    • Not started yet.
  • 🟢 Multiple breakdowns in Trends released to users (owner: @skoob13)
    • Probably going to be released this week. Some issues with data warehouse queries & a remaining bugfix.
  • 🔴 Project Environments (owner: @Twixes)
    • Probably lots of things going on + vacation.
  • 🟢 Insight background reloads monitoring/cleanup (@webjunkie)
    • Released behind a feature flag. Improved cache lifetime, so we can roll it out more.
  • 🟢 Fixed a lot of issues with the new support system. Fixed an OOM issue with cohorts for a large customer, but one by one all the places where we naively fetch persons break -> need to have a general answer. Same for properties of events.

Extra things done

  • Offsite planning
  • Working with a contributor, Nikita, who's in the process of shipping analytics alerts

High priority

  • @webjunkie Insight caching state investigation (do we need pre-warming? -> inclined to remove).
  • @skoob13 Probably start experimenting with LLMs on insights (natural text -> query nodes).
  • @aspicer Driving along tickets.
  • @Twixes tbd

Low priority / side quests

  • @webjunkie Staying on alerts topic to make sure we have a good first version.

Q3 2024 objectives

  1. Rock-solid analytics (@thmsobrmlr + @webjunkie + @aspicer + @anirudhpillai)
    1. Legacy Minus – removing legacy insights code so that we can move fast
      • FilterType gone from the frontend.
      • rm -rf posthog/queries/
      • Experiments ported to HogQL.
      • All the flags from HogQL/querying work.
    2. Tests Plus – shipping fewer bugs in the first place
      • Ensure we test with the feature flags that users actually experience, both in end-to-end and integration tests.
      • When shipping changes to queries, replay old vs. new version on thousands of real queries to check for regressions.
    3. Metrics Plus – catching issues before users report them
      • Analytics performance dashboard in Grafana (query duration, failures, etc.). Paging alerts on critical metrics, e.g. if the number of queries drops rapidly, or failures rise.
      • Analytics experience dashboard in PostHog (time till data available, result freshness across insights and subscriptions, refreshes initiated manually vs. automatically, etc.)
      • Alerts on major Product Analytics errors from Sentry, and us acting on every alert. (Bonus: checking up the Sentry routing rules for the #product-analytics team.)
      • Cohorts dashboard in Grafana (successful vs. failed calculations per day, recalculation backlog). Alerts here too.
    4. Performance Plus - eliminating UX pain via maximum query performance/reliability, based on Metrics Plus data
      • Partial calculation of multi-day time series results …and more – work with Team Query Performance to find the lowest-hanging fruit, similarly to Tim's performance mega issue
    5. Support Plus – sparking joy for users when they’re led to report a bug
      • 1 hero + 1 sidekick
      • Goal: 90% of tickets fulfill the SLA
  2. Answering more product questions, deeper (@thmsobrmlr + @webjunkie + @aspicer + @anirudhpillai)
    1. Growth Plus - increasing ease of onboarding, and subsequent retention
      • Identify growth opportunities working with Anna, our product manager – implement growth optimizations and track their impact whenever possible.
      • Work with Team Growth on optimizing the onboarding experience of Product Analytics.
    2. Analysis Plus - answering more product questions, more deeply
      • Analytics alerts are out to users (implemented with the contributor)
      • “Done for the first time” in Trends, to kill the janky First Time Event Plugin
      • Query in new insight URL for instant insight sharing
      • Optional funnel steps
      • ...and more, based on user feedback - see the most requested features in GitHub
  3. ArtificialHog (@Twixes + @skoob13) – an LLM-based chat-like interface for answering product questions.

thmsobrmlr avatar Jul 03 '24 13:07 thmsobrmlr

Team Pipeline

Off: Brett 2 days, Xavier 2 days Support: Tiina

Retro

High priority

  • [x] Fix excessive overrides written in support of Personless mode (Brett)
  • [ ] Hog support for Rusty-Hook (Brett) (This takes backburner to the one above)
    • carry-over
  • [x] capture-rs: fix billing limits (Xavier)

Low priority / side quests

  • [x] Finish hog-rs to posthog repo migration (deploy out of posthog through state.yaml)
  • [ ] Collect rdkafka metrics (broker response latency, error rates) for all node producers & consumer (Xavier)
    • carry-over
  • [ ] capture-rs: read redis out-of-band (avoid latency if redis slow)
    • carry-over

OKR

✅=finished 🟢=on track to finish this quarter 🟡=might not finish 🔴=won't finish ✔️=progressed last sprint ; ➡️=planned work for this sprint

  • 🟢 Test Warpstream as PoC and decide whether to do it or not
  • 🟢➡️ Pipeline scalability: improving pipeline throughput
  • 🟢➡️ Help other teams ship fast
  • 🟢 Stretch: better e2e monitoring

High priority

  • [ ] Hog support for Rusty-Hook (Brett)
  • [ ] Separate pipeline for $$heatmap events (Xavier, Tiina, Paul)
  • [ ] Inline a processEvent plugin (Oliver)

Low priority / side quests

  • [ ] Collect rdkafka metrics (broker response latency, error rates) for all node producers & consumer (Xavier)

tiina303 avatar Jul 03 '24 14:07 tiina303

Team Click Haus, Haus of the Hogs

OKR Q2 2024

Objective

James as a Service -> Clickhouse as a Service

  • P0 tasks such as
    • 🟡 Deletes
    • 🟢 Keeping clusters happy
    • 🟢 Provisioning more disks
    • 🟢 Schema Reviews
    • 🟢 Debugging
    • 🟡 Performance
    • 🟢 Backups/Restores
  • Decide whether ByConity is the way forward
    • 🟢 Load it with data, set up
    • 🟢 Test performance, test the functionality/compatibility gaps
  • IF ByConity works, migrate over to it
    • 🟢 Enumerate all functionality that doesn’t work and update the functions/contribute to ByConity
    • 🟢 Syntax
    • 🟡 If it works on metal, put it in k8s with Karpenter
    • 🟡 Evaluate which nodes we should use
  • IF ByConity doesn’t work, reshard US to look like EU cluster
    • 🟡 All clusters (Dev, US, EU) should be consistent in shape and topology. This will make it easier to manage and maintain the clusters and apply learnings from one cluster to another.
    • 🟢 We want all cluster operations to be automated and managed through some form of infra as code that is available in source control.
    • 🟡 Schema management on ClickHouse should be entirely automated and managed through source control with no exceptions. This includes Coordinator schemas.
    • 🟢 We should be able to spin up and down replicas of any cluster with no manual intervention.
    • 🟢 We should be able to upgrade ClickHouse versions with no manual intervention.
    • 🟡 We should have tooling / runbooks for resharding (if we continue down the current coordinator path)

Board

https://github.com/orgs/PostHog/projects/85/views/2

Retro

@Daesgar - There have been changes to our scope. We have changed our scope by 1/2 just because of changing priorities and fires. Feeling comfortable. Able to do config automation and provide value in the first sprint. Working on the backups. Needs more context for the rest of how things work at PH (like the plugin server). Sometimes it's hard to get focus on something. When a question comes up in the chat there is ambiguity on whether it's something urgent or something to focus on.

@fuziontech - Overall I think this sprint went amazingly. Having 2x the firepower is a hack. Getting a lot more done than even my highest expectations. Having ~two incidents was less than ideal though for a first sprint.

  • [ ] 📟 Monitoring and Alerting on EU Coordinator
  • [ ] ⏩ Move parts around so last 3 months of data are on NVME on US @Daesgar
  • [ ] Retire old Offline Nodes on US Cluster @fuziontech
  • [ ] Remove projections in EU on events table @fuziontech
  • [ ] 🗑️ Delete persons on teams that are still ingesting data (for personless events) @fuziontech
  • [ ] Configs in Ansible for ClickHouse EU Coordinator @Daesgar
  • [ ] Configs in Ansible for ClickHouse US @Daesgar
  • [x] 🧪 Test ~~incremental~~ backup restores
  • [x] Major fixes to HouseWatch backups @Daesgar (unplanned but needed)
  • [x] πŸƒ 2 new i4i.metal replicas for US
  • [x] Configs in Ansible for ClickHouse EU @Daesgar
  • [x] 🔥 Kafka consumer fire recovery and initial debugging (major distraction)

High priority

image

fuziontech avatar Jul 03 '24 14:07 fuziontech

Team ~~web analytics~~ session table

Support hero: @robbie-c

Retro

Session table PR got merged, we are dogfooding, I'm fixing issues as they come up.

Had some detailed customer interactions around channel type attribution. One customer sent me a spreadsheet of their GA compared with us. We're pretty close but there were a few differences that I was able to fix or help them fix. A few other support tickets have asked for help with this, so I'm adding a session attribution debugger.

Tasks

  • 🟢 Get session table v2 PR over the line
  • 🟢 Start backfilling, prioritising EU, and team 2 on US for dogfooding
  • 🆕 Help customers debug attribution
  • 🆕🟢 Add live session count

Stretch

🔴 Get a version of WA up that is terrifyingly fast because it can just use the sessions table + it can sample

OKR

  1. Make querying fast enough for large customers
  2. Heavily requested features
  3. Improve synergy with other products
  4. Product and growth

High priority

  • Figure out difference between queries with session v1 and v2
  • Clear out the support queue
  • Finish the attribution debug tool
  • Improve the refresh logic

Ongoing

  • In the background, continue to backfill the sessions table

robbie-c avatar Jul 03 '24 15:07 robbie-c

Team Feature Success

Support hero: @Phanatic

Days off: Juraj: 1 day, Phani: 0 days, Dylan: 1 day, Neil: 2 weeks

Retro

  • event based triggers with actions and filters - @Phanatic - action based filters are ready, tests in progress!
  • Surveys branching logic release - @jurajmajerik - ✅ - e2e tests in progress
  • no-code experiments RFC - @jurajmajerik ❌
  • https://github.com/PostHog/posthog/issues/20851 - @dmarticus - PR is in review, should be sorted by Fri
  • https://github.com/PostHog/posthog/issues/22516 - @dmarticus - PR out by Friday
  • https://github.com/PostHog/posthog/issues/22131 continue on rewrite - get flags from db, handle cohorts - @neilkakkar -> 🤒❌

Hang over items from previous sprint


OKRs

  1. Make sure feature flags can handle 10x current scale
  2. No-code experiments
  3. Split out experiments into its own product

High priority

  • event based triggers for surveys - filters - @Phanatic
  • no-code experiments RFC - @Phanatic
  • Some experiments UX fixes (banners, display MDE next to the progress bar + explain), docs refresh - @jurajmajerik
  • Setup instrumentation for flip-flopping problem of experiment significance - @jurajmajerik
  • https://github.com/PostHog/posthog/issues/22131 continue on rewrite - @dmarticus

Low priority / side quests / maybe Neil will get to this next year

  • Temporal queues for feature success - @neilkakkar

neilkakkar avatar Jul 03 '24 15:07 neilkakkar

Team Growth

Retro

Retro items
  • [x] Q3 planning
  • @raquelmsmith
    • [x] Support for first week
    • [x] Pricing page experiments - iterate here with cory and eli until it's done
    • [x] Stay on top of revenue issues
    • [x] Start working on toolbar dashboard template thing
    • [x] Keep on top of personless comms and customer issues and metrics
    • [x] Lots of interviews...
  • @zlwaterfield
    • [ ] Complete subscribe to all products
      • [x] frontend changes
      • [x] release under feature flag to new users
      • [ ] backfill existing users and communicate with them
      • [x] (if time permits - probably next sprint) cleanup! remove/clean single product subscribe code where we can.
    • [x] Start on the Stripe metadata changes - close RFC, updates to Zapier, work on backfill, etc.
    • [ ] re-run the plans map and compare with the new auto-cancel functionality

Q3 Goals

✅=finished 🟡=in progress 🔴=won't finish ⚪=not started

  1. 🟡 Make onboarding awesome for Product analytics and Data warehouse (Raquel)
  2. ⚪ Support self-serve annual commitments (Zach)
  3. ⚪ Dive into the data to understand our billing metrics and customers better (Zach)
  4. 🟡 Launch pricing for data warehouse (Raquel)
  5. 🟡 Hire 2 people (one for billing, one for auth/permissions focus)

This sprint

High priority

  • @raquelmsmith (support first week, on-call second week)
    • Personless events launch
      • [x] Oversee pricing calc changes, keep iterating until sales feels like it's good and we feel like it works for us as well
      • [ ] If above is completed, make sure comms are sent out
      • [ ] Figure out if we can roll default out to everyone
    • Data warehouse pricing
      • [x] Launch it for non-beta-users
    • Dashboard templates in onboarding
      • [ ] Fix error that happens after creation, merge https://github.com/PostHog/posthog/pull/23069
      • [ ] Add flow to launch toolbar from in-app
    • Hiring
    • Project-access-on-invites
      • [x] Do some digging to see what this entails (I don't think it will be involved or difficult)
  • @zlwaterfield (on call first week - support second week)
    • subscribe to all products
      • [ ] Run backfill for subscribe to all products and notify users
      • [x] Remove feature flag code for subscribe to all products and cleanup code
    • stripe startup metadata
      • [ ] Finish stripe metadata clean - 20-30 left to manual fixes + a few hours of manual checks
      • [x] Build a startup plan dashboard in PostHog
    • misc
      • [x] Think through "Free / paid - same feature-set"
      • [ ] Add at least one E2E SAML test
      • [ ] Run backfill for starting/ending backfill bug
      • [ ] Re-run the plans map compare
      • [x] Deprecate billing v2 (PR done just need to merge)

raquelmsmith avatar Jul 03 '24 15:07 raquelmsmith

Team Infra

OKR

  1. 🦹 Zero-trust security
  2. 🤓 10x Developer Experience
  3. 💪 Every service lives and dies alone
  4. 💰 Save big on cost

High priority

  • Reverse proxy sharding approach + roadmap @frankh
  • Billing alerts and follow up from incident issues @danielxnj
  • Postgres issue investigation in EU - bigint etc @danielxnj
  • Start planning out security group changes @danielxnj
  • VPA grafana chart changes in all regions @ZeleniJure
  • Autoscaling based on celery queue depth @ZeleniJure

benjackwhite avatar Jul 08 '24 14:07 benjackwhite