0030_created_at_persons_and_groups_on_events.py tries to add a column to "writable_events", but this table does not exist when CLICKHOUSE_REPLICATION=false
Bug description
- I'm trying to start a fresh setup with 2 machines: one hosting the datastores (ClickHouse + Kafka), one hosting the code
How to reproduce
- use the following docker-compose.yml:

```yaml
version: "3"
services:
  redis:
    # pinned to version 6 as the latest one is 7+, not supported by the PostHog version pinned below
    image: "redis:6-alpine"
    container_name: posthog_redis
    restart: always
  web: &web
    image: posthog/posthog:latest-release
    restart: on-failure
    environment:
      SKIP_SERVICE_VERSION_REQUIREMENTS: 'true'
      DISABLE_SECURE_SSL_REDIRECT: 'true'
      DATABASE_URL: "URL_TO_YOUR_POSTGRES"
      CLICKHOUSE_HOST: 'IP_OF_CLICKHOUSE'
      CLICKHOUSE_SECURE: 'false'
      CLICKHOUSE_VERIFY: 'false'
      CLICKHOUSE_REPLICATION: 'false'
      KAFKA_URL: 'kafka://IP_OF_KAFKA'
      REDIS_URL: 'redis://redis:6379/'
      SECRET_KEY: "somethingsomething"
      SITE_URL: "https://something.example.com"
      DEPLOYMENT: hello_i_am_trapped_in_an_environment_variable
    depends_on:
      - redis
  worker:
    <<: *web
    image: posthog/posthog:latest-release
    command: ./bin/docker-worker-celery --with-scheduler
  plugins:
    image: posthog/posthog:latest-release
    command: ./bin/plugin-server --no-restart-loop
    restart: on-failure
    environment:
      SKIP_SERVICE_VERSION_REQUIREMENTS: 'true'
      DISABLE_SECURE_SSL_REDIRECT: 'true'
      SECRET_KEY: "somethingsomething"
      DATABASE_URL: "URL"
      KAFKA_HOSTS: '10.40.0.4:9092'
      REDIS_URL: 'redis://redis:6379/'
      CLICKHOUSE_HOST: 'IP_OF_CLICKHOUSE'
      CLICKHOUSE_SECURE: 'false'
      CLICKHOUSE_VERIFY: 'false'
      CLICKHOUSE_REPLICATION: 'false'
    depends_on:
      - redis
  asyncmigrationscheck:
    <<: *web
    command: python manage.py run_async_migrations --check
    restart: on-failure
```

- run `docker-compose up`
The web container will then display:
```
web_1 | Traceback (most recent call last):
web_1 | File "manage.py", line 21, in <module>
web_1 | main()
web_1 | File "manage.py", line 17, in main
web_1 | execute_from_command_line(sys.argv)
web_1 | File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
web_1 | utility.execute()
web_1 | File "/usr/local/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute
web_1 | self.fetch_command(subcommand).run_from_argv(self.argv)
web_1 | File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv
web_1 | self.execute(*args, **cmd_options)
web_1 | File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute
web_1 | output = self.handle(*args, **options)
web_1 | File "/code/posthog/management/commands/migrate_clickhouse.py", line 42, in handle
web_1 | self.migrate(CLICKHOUSE_HTTP_URL, options)
web_1 | File "/code/posthog/management/commands/migrate_clickhouse.py", line 80, in migrate
web_1 | database.migrate(MIGRATIONS_PACKAGE_NAME, options["upto"], replicated=CLICKHOUSE_REPLICATION)
web_1 | File "/usr/local/lib/python3.8/site-packages/infi/clickhouse_orm/database.py", line 352, in migrate
web_1 | operation.apply(self)
web_1 | File "/usr/local/lib/python3.8/site-packages/infi/clickhouse_orm/migrations.py", line 250, in apply
web_1 | self._func(database)
web_1 | File "/code/posthog/clickhouse/migrations/0030_created_at_persons_and_groups_on_events.py", line 21, in add_columns_to_required_tables
web_1 | sync_execute(ADD_COLUMNS_BASE_SQL.format(table="writable_events", cluster=CLICKHOUSE_CLUSTER))
web_1 | File "/code/posthog/client.py", line 161, in sync_execute
web_1 | raise err
web_1 | posthog.errors.CHQueryErrorUnknownTable: Code: 60.
web_1 | DB::Exception: There was an error on [10.40.0.4:9000]: Code: 60. DB::Exception: Table default.writable_events doesn't exist. (UNKNOWN_TABLE) (version 22.8.5.29 (official build)). Stack trace:
web_1 |
web_1 | 0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0x8dcd368 in /usr/bin/clickhouse
web_1 | 1. DB::DDLQueryStatusSource::generate() @ 0x11da27ac in /usr/bin/clickhouse
web_1 | 2. DB::ISource::tryGenerate() @ 0x127898ac in /usr/bin/clickhouse
web_1 | 3. DB::ISource::work() @ 0x12789390 in /usr/bin/clickhouse
web_1 | 4. DB::ExecutionThreadContext::executeTask() @ 0x127a35b4 in /usr/bin/clickhouse
web_1 | 5. DB::PipelineExecutor::executeStepImpl(unsigned long, std::__1::atomic<bool>*) @ 0x1279862c in /usr/bin/clickhouse
web_1 | 6. DB::PipelineExecutor::executeImpl(unsigned long) @ 0x12797ccc in /usr/bin/clickhouse
web_1 | 7. DB::PipelineExecutor::execute(unsigned long) @ 0x12797a94 in /usr/bin/clickhouse
web_1 | 8. ? @ 0x127a6e0c in /usr/bin/clickhouse
web_1 | 9. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x8e7f950 in /usr/bin/clickhouse
web_1 | 10. ? @ 0x8e82844 in /usr/bin/clickhouse
web_1 | 11. start_thread @ 0x7624 in /usr/lib/aarch64-linux-gnu/libpthread-2.31.so
web_1 | 12. ? @ 0xd149c in /usr/lib/aarch64-linux-gnu/libc-2.31.so
```
Environment
- [ ] PostHog Cloud
- [x] self-hosted PostHog, version/commit: please provide
Additional context
It seems that this migration hardcodes these tables:

```python
def add_columns_to_required_tables(_):
    sync_execute(ADD_COLUMNS_BASE_SQL.format(table="events", cluster=CLICKHOUSE_CLUSTER))
    sync_execute(ADD_COLUMNS_BASE_SQL.format(table="writable_events", cluster=CLICKHOUSE_CLUSTER))
    sync_execute(ADD_COLUMNS_BASE_SQL.format(table="sharded_events", cluster=CLICKHOUSE_CLUSTER))
```
while in other places, e.g. posthog/models/event/sql.py, we have:

```python
WRITABLE_EVENTS_DATA_TABLE = lambda: "writable_events" if settings.CLICKHOUSE_REPLICATION else EVENTS_DATA_TABLE()
```
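The selection logic in that lambda can be sketched as a plain function (a simplified, hypothetical rendering for illustration; in the real code the fallback `EVENTS_DATA_TABLE()` itself also depends on the replication setting):

```python
# Simplified sketch (hypothetical names): with replication enabled, writes go
# through the Distributed "writable_events" table; without it, they go to the
# plain "events" table. This is why "writable_events" never exists in a
# non-replicated setup.

def writable_events_data_table(replicated: bool) -> str:
    return "writable_events" if replicated else "events"

print(writable_events_data_table(True))   # -> writable_events
print(writable_events_data_table(False))  # -> events
```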
Manually removing these lines from the migration seems to solve the issue:

```python
sync_execute(ADD_COLUMNS_BASE_SQL.format(table="writable_events", cluster=CLICKHOUSE_CLUSTER))
sync_execute(ADD_COLUMNS_BASE_SQL.format(table="sharded_events", cluster=CLICKHOUSE_CLUSTER))
```
so I guess they should be behind an `if`?
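A minimal sketch of what that guard could look like; the stand-in helpers below stub out the real PostHog ones (`sync_execute`, `clickhouse_is_replicated`, `ADD_COLUMNS_BASE_SQL`) so the shape of the fix can be demonstrated on its own:

```python
# Sketch of the proposed fix: only touch the replication-only tables when
# replication is enabled. Everything below is a stand-in for the real
# PostHog helpers and SQL, kept only to make the control flow runnable.

ADD_COLUMNS_BASE_SQL = "ALTER TABLE {table} ON CLUSTER {cluster} ADD COLUMN ..."
CLICKHOUSE_CLUSTER = "posthog"
CLICKHOUSE_REPLICATION = False  # stand-in for the env-driven setting

executed = []  # records what would be sent to ClickHouse

def sync_execute(sql):  # stand-in for posthog.client.sync_execute
    executed.append(sql)

def clickhouse_is_replicated():
    return CLICKHOUSE_REPLICATION

def add_columns_to_required_tables(_):
    # "events" exists in both replicated and non-replicated setups
    sync_execute(ADD_COLUMNS_BASE_SQL.format(table="events", cluster=CLICKHOUSE_CLUSTER))
    if clickhouse_is_replicated():
        # these tables only exist when CLICKHOUSE_REPLICATION=true
        sync_execute(ADD_COLUMNS_BASE_SQL.format(table="writable_events", cluster=CLICKHOUSE_CLUSTER))
        sync_execute(ADD_COLUMNS_BASE_SQL.format(table="sharded_events", cluster=CLICKHOUSE_CLUSTER))

add_columns_to_required_tables(None)
print(len(executed))  # -> 1, since replication is off
```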
Thank you for your bug report – we love squashing them!
Hi! Did you perhaps have some previous docker-compose volumes lying around?
If so, try `docker-compose rm -v` and try again.
@macobo thanks for your reply. As you can see in my docker-compose.yml, I don't have any volumes,
and each time I have dropped the databases on Postgres and ClickHouse (which are installed on RDS and a dedicated EC2 instance, respectively).
Thanks for the clarification! I suspect there might be something missing:
- What's the command you're using for dropping the database on ClickHouse? Note that ClickHouse doesn't actually drop replicated data immediately and it doesn't clear ZooKeeper, potentially causing issues if the same database/paths are reused.
- Could you check the `system.tables` ClickHouse table after the drop for any `infi_`-related tables?
- Could you output the full logs for the web pod during migration?
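For the second point, the check is just a `SELECT name FROM system.tables WHERE database = 'default'` query; as a hypothetical sketch, the filtering of the result for stale leftovers could look like this (table names and prefixes are illustrative assumptions, not an official list):

```python
# Hypothetical helper: given the table names returned by
#   SELECT name FROM system.tables WHERE database = 'default'
# flag leftovers that suggest stale state from a previous install
# (infi_clickhouse_orm migration bookkeeping, replication-only events tables).

def leftover_tables(names):
    suspicious_prefixes = ("infi_", "writable_", "sharded_")
    return sorted(n for n in names if n.startswith(suspicious_prefixes))

print(leftover_tables(["events", "infi_clickhouse_orm_migrations", "person"]))
# -> ['infi_clickhouse_orm_migrations']
```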
In general, though, docker-compose isn't suited for a lot of customization (as you're doing here), and issues might arise as our stack evolves. I would recommend checking out our k8s-based installation as well, which is better suited for these setups: https://posthog.com/docs/self-host https://github.com/PostHog/charts-clickhouse/
I will check that; I will try on a dedicated new machine so that we're sure there's no state floating around.
As for k8s, it would really increase the cost for nothing (I have a single ClickHouse node).
Actually, I was thinking of going in the opposite direction and seeing whether PostHog could work without a "cluster" (as it's a single node anyway).
@macobo to be honest I haven't had time to get a new instance running, but for what it's worth, migration 0027 has this:
```python
def add_columns_to_required_tables(_):
    sync_execute(ADD_COLUMNS_BASE_SQL.format(table="events", cluster=CLICKHOUSE_CLUSTER))
    if clickhouse_is_replicated():
        sync_execute(ADD_COLUMNS_BASE_SQL.format(table="writable_events", cluster=CLICKHOUSE_CLUSTER))
        sync_execute(ADD_COLUMNS_BASE_SQL.format(table="sharded_events", cluster=CLICKHOUSE_CLUSTER))
```
Adding this check in 0030 makes it work, so I don't see why it would be needed for 0027 but not for 0030 (especially as not having it creates a bug on my side).
Hi! Just saw your PR.
This is missing cleanup on our end, but we're planning on removing the CLICKHOUSE_REPLICATION environment variable going forward, so the fix would land you in a situation where you're unable to upgrade. Note that setting it to true doesn't actually require setting up a full cluster; it does, however, require a working ZooKeeper setup.
Ahh, fortunately you told me that before I upgraded our current installation ^^' So I guess I'm back to square one (I got other issues with the migrations when it was turned to true... so expect more issues from me ^^")