posthog icon indicating copy to clipboard operation
posthog copied to clipboard

hackathon(bi): Add direct SQL query support for external database

Open andrewm4894 opened this issue 1 month ago β€’ 7 comments

MetaHog - Direct Database Connection BI Layer

Enable users to connect to external databases (starting with Postgres) and run SQL queries directly against them from the PostHog SQL editor, without requiring batch import to ClickHouse.

Documentation

Features

  • [x] Add Postgres connection via UI with "Query only" checkbox
  • [x] Select connected database as query target in SQL editor
  • [x] Execute SQL queries directly against connected database
  • [x] See schema (tables/columns) for connected databases with ⚑ indicator
  • [x] Results display in same format as HogQL queries
  • [x] DirectQuery integrated into unified /query/ API

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                      SQL Editor UI                          β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚ Database Select β”‚  β”‚ Monaco Editor                    β”‚ β”‚
β”‚  β”‚ - PostHog (HogQL)β”‚  β”‚ SELECT * FROM postgres.film     β”‚ β”‚
β”‚  β”‚ - Pagila (PG) βš‘β”‚  β”‚ LIMIT 10;                        β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β–Ό                               β–Ό
      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”               β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
      β”‚  ClickHouse  β”‚               β”‚   External   β”‚
      β”‚  (PostHog)   β”‚               β”‚   Postgres   β”‚
      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜               β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Changes

Backend:

  • ExternalDataSource.query_only field - skip data sync for query-only sources
  • DirectQueryExecutor service - execute SQL against external Postgres
  • DirectQuery kind in unified /query/ API
  • /direct_query/sources and /direct_query/schema/:id endpoints

Frontend:

  • DatabaseSelector component - switch between HogQL and external databases
  • directQueryLogic.ts - Kea logic for direct query state
  • Schema browser integration with ⚑ icon for direct query tables
  • Automatic table prefix stripping (postgres.film β†’ film)

Security

  • βœ… Query timeout protection (30s default) via SET statement_timeout
  • βœ… Read-only mode enforced via SET default_transaction_read_only = ON
  • βœ… Error message sanitization (no credential leaks in responses)
  • βœ… Structured logging with structlog for audit trail
  • βœ… Team isolation enforced on all endpoints

Testing

16 unit tests covering:

  • API validation (source_id, sql required)
  • Team isolation and access control
  • Query execution happy path and error handling
  • Schema retrieval
  • Connection parameter extraction
  • Timeout and read-only enforcement verification
  • Error message sanitization

Run: pytest products/data_warehouse/backend/api/test/test_direct_query.py -v

Manual Testing

  1. Start demo Pagila database: cd ~/Documents/GitHub/pagila && docker compose up -d
  2. Add Postgres source with "Query only" checkbox
  3. Select database from dropdown
  4. Run: SELECT * FROM postgres.film LIMIT 10

πŸ€– Generated with Claude Code

andrewm4894 avatar Dec 09 '25 16:12 andrewm4894

Size Change: +668 B (+0.02%)

Total Size: 3.51 MB

ℹ️ View Unchanged
Filename Size Change
frontend/dist/toolbar.js 3.51 MB +668 B (+0.02%)

compressed-size-action

github-actions[bot] avatar Dec 09 '25 16:12 github-actions[bot]

Migration SQL Changes

Hey πŸ‘‹, we've detected some migrations on this PR. Here's the SQL output for each migration, make sure they make sense:

products/data_warehouse/backend/migrations/0012_externaldatasource_query_only.py

BEGIN;
--
-- Add field query_only to externaldatasource
--
ALTER TABLE "posthog_externaldatasource" ADD COLUMN "query_only" boolean DEFAULT false NOT NULL;
ALTER TABLE "posthog_externaldatasource" ALTER COLUMN "query_only" DROP DEFAULT;
COMMIT;

products/data_warehouse/backend/migrations/0013_add_is_direct_query_to_table.py

BEGIN;
--
-- Add field is_direct_query to datawarehousetable
--
ALTER TABLE "posthog_datawarehousetable" ADD COLUMN "is_direct_query" boolean DEFAULT false NOT NULL;
ALTER TABLE "posthog_datawarehousetable" ALTER COLUMN "is_direct_query" DROP DEFAULT;
--
-- Alter field format on datawarehousetable
--
-- (no-op)
--
-- Alter field url_pattern on datawarehousetable
--
-- (no-op)
COMMIT;

Last updated: 2025-12-09 21:23 UTC (be3b27c)

github-actions[bot] avatar Dec 09 '25 16:12 github-actions[bot]

πŸ” Migration Risk Analysis

We've analyzed your migrations for potential risks.

Summary: 1 Safe | 1 Needs Review | 0 Blocked

⚠️ Needs Review

May have performance impact

data_warehouse.0013_add_is_direct_query_to_table
  └─ #1 βœ… AddField
     Adding NOT NULL field with constant default (safe in PG11+)
     model: datawarehousetable, field: is_direct_query
  └─ #2 ⚠️ AlterField
     Field alteration may cause table locks or data loss (check if changing type or constraints)
     model: datawarehousetable, field: format, field_type: CharField
  └─ #3 ⚠️ AlterField
     Field alteration may cause table locks or data loss (check if changing type or constraints)
     model: datawarehousetable, field: url_pattern, field_type: CharField

βœ… Safe

Brief or no lock, backwards compatible

data_warehouse.0012_externaldatasource_query_only
  └─ #1 βœ… AddField
     Adding NOT NULL field with constant default (safe in PG11+)
     model: externaldatasource, field: query_only

Last updated: 2025-12-09 21:23 UTC (be3b27c)

github-actions[bot] avatar Dec 09 '25 16:12 github-actions[bot]

Query snapshots: Backend query snapshots updated

Changes: 3 snapshots (3 modified, 0 added, 0 deleted)

What this means:

  • Query snapshots have been automatically updated to match current output
  • These changes reflect modifications to database queries or schema

Next steps:

  • Review the query changes to ensure they're intentional
  • If unexpected, investigate what caused the query to change

Review snapshot changes β†’

posthog-bot avatar Dec 09 '25 17:12 posthog-bot

Wiz Scan Summary

Scanner Findings
Vulnerability Finding Vulnerabilities -
Data Finding Sensitive Data 1 Info
Secret Finding Secrets -
IaC Misconfiguration IaC Misconfigurations -
SAST Finding SAST Findings -
Total 1 Info

View scan details in Wiz

To detect these findings earlier in the dev lifecycle, try using Wiz Code VS Code Extension.

wiz-7ad640923b[bot] avatar Dec 09 '25 19:12 wiz-7ad640923b[bot]

⏭️ Skipped snapshot commit because branch advanced to bcdaaa6 while workflow was testing 3a40c78.

The new commit will trigger its own snapshot update workflow.

If you expected this workflow to succeed: This can happen due to concurrent commits. To get a fresh workflow run, either:

  • Merge master into your branch, or
  • Push an empty commit: git commit --allow-empty -m 'trigger CI' && git push

posthog-bot avatar Dec 09 '25 20:12 posthog-bot

Query snapshots: Backend query snapshots updated

Changes: 2 snapshots (2 modified, 0 added, 0 deleted)

What this means:

  • Query snapshots have been automatically updated to match current output
  • These changes reflect modifications to database queries or schema

Next steps:

  • Review the query changes to ensure they're intentional
  • If unexpected, investigate what caused the query to change

Review snapshot changes β†’

posthog-bot avatar Dec 09 '25 20:12 posthog-bot

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week. If you want to permanentely keep it open, use the waiting label.

github-actions[bot] avatar Dec 17 '25 07:12 github-actions[bot]