Failed migrations and KeyError: 'query' after upgrade
Self-Hosted Version
22.7.0
CPU Architecture
x86_64
Docker Version
19.03.8
Docker Compose Version
1.29.2
Steps to Reproduce
On CentOS 7, upgrade from 21.4.1 to 21.6.3, then to 22.7.0 (as per the "Hard Stops" docs).
The upgrade to 21.6.3 went fine, but the upgrade to 22.7.0 produced errors during migration. Recreating the kafka/zookeeper volumes lets Sentry run normally for a few hours, then the subscription consumers fail with "KeyError: 'query'".
Following the steps mentioned in #1249 (recreating the volumes again) lets it run normally for another few hours, then it fails with the same error again.
Expected Result
Migrations complete without errors, and incoming errors continue to be processed.
Actual Result
Errors encountered while migrating from 21.6.3 to 22.7.0, during the "Setting up / migrating database ..." step (full install log in attachments):
ls: cannot access '/usr/local/share/ca-certificates/': Operation not permitted
sentry/requirements.txt is deprecated, use sentry/enhance-image.sh - see https://github.com/getsentry/self-hosted#enhance-sentry-image
stat: cannot statx '/data': Operation not permitted
...and:
Applying sentry.0233_recreate_subscriptions_in_snuba...Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/sentry/migrations/0233_recreate_subscriptions_in_snuba.py", line 29, in migrate_subscriptions
subscription_id = _create_in_snuba(subscription)
File "/usr/local/lib/python3.8/site-packages/sentry/snuba/tasks.py", line 192, in _create_in_snuba
entity_subscription = get_entity_subscription_from_snuba_query(
File "/usr/local/lib/python3.8/site-packages/sentry/snuba/entity_subscription.py", line 581, in get_entity_subscription_from_snuba_query
SnubaQuery.Type(snuba_query.type),
AttributeError: 'SnubaQuery' object has no attribute 'type'
07:25:48 [ERROR] root: failed to recreate 0/c2c87862f21d11ec9ae50242ac120002: 'SnubaQuery' object has no attribute 'type'
The consumer errors after running for a few hours are the same as in #1249.
Changes made to docker-compose.yml:
diff --git a/docker-compose.yml b/docker-compose.yml
index c35dc3a..6bb6835 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -18,7 +18,8 @@ x-sentry-defaults: &sentry_defaults
<<: *restart_policy
image: sentry-self-hosted-local
# Set the platform to build for linux/arm64 when needed on Apple silicon Macs.
- platform: ${DOCKER_PLATFORM:-}
+ #platform: ${DOCKER_PLATFORM:-}
+ #platform: "linux/amd64"
build:
context: ./sentry
args:
@@ -58,6 +59,7 @@ x-sentry-defaults: &sentry_defaults
PYTHONUSERBASE: "/data/custom-packages"
SENTRY_CONF: "/etc/sentry"
SNUBA: "http://snuba-api:1218"
+ GEOIP_PATH_MMDB: '/geoip/GeoLite2-City.mmdb'
# Force everything to use the system CA bundle
# This is mostly needed to support installing custom CA certs
# This one is used by botocore
@@ -68,12 +70,14 @@ x-sentry-defaults: &sentry_defaults
GRPC_DEFAULT_SSL_ROOTS_FILE_PATH_ENV_VAR: *ca_bundle
# Leaving the value empty to just pass whatever is set
# on the host system (or in the .env file)
- SENTRY_EVENT_RETENTION_DAYS:
+ SENTRY_EVENT_RETENTION_DAYS: 56
SENTRY_MAIL_HOST:
volumes:
- - "sentry-data:/data"
+ - "/data/tncdata/sentry/sentry-data:/data"
+ #- "./sentry-data:/data"
- "./sentry:/etc/sentry"
- - "./geoip:/geoip:ro"
+ - "/data/tncdata/sentry/geoip:/geoip"
+ #- "./geoip:/geoip"
- "./certificates:/usr/local/share/ca-certificates:ro"
x-snuba-defaults: &snuba_defaults
<<: *restart_policy
@@ -94,12 +98,13 @@ x-snuba-defaults: &snuba_defaults
UWSGI_DISABLE_LOGGING: "true"
# Leaving the value empty to just pass whatever is set
# on the host system (or in the .env file)
- SENTRY_EVENT_RETENTION_DAYS:
+ SENTRY_EVENT_RETENTION_DAYS: 56
services:
smtp:
<<: *restart_policy
image: tianon/exim4
- hostname: "${SENTRY_MAIL_HOST:-}"
+ #hostname: "${SENTRY_MAIL_HOST:-}"
+ hostname: "sentry2.redacted.com"
volumes:
- "sentry-smtp:/var/spool/exim4"
- "sentry-smtp-log:/var/log/exim4"
@@ -117,7 +122,7 @@ services:
<<: *healthcheck_defaults
test: redis-cli ping
volumes:
- - "sentry-redis:/data"
+ - "/data/tncdata/sentry/sentry-redis:/data"
ulimits:
nofile:
soft: 10032
@@ -128,7 +133,8 @@ services:
healthcheck:
<<: *healthcheck_defaults
# Using default user "postgres" from sentry/sentry.conf.example.py or value of POSTGRES_USER if provided
- test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-postgres}"]
+ #test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-postgres}"]
+ test: ["CMD-SHELL", "pg_isready -U sentryuser"]
command:
[
"postgres",
@@ -143,7 +149,7 @@ services:
POSTGRES_HOST_AUTH_METHOD: "trust"
entrypoint: /opt/sentry/postgres-entrypoint.sh
volumes:
- - "sentry-postgres:/var/lib/postgresql/data"
+ - "/data/tncdata/sentry/sentry-postgres:/var/lib/postgresql/data"
- type: bind
read_only: true
source: ./postgres/
@@ -158,8 +164,8 @@ services:
ZOOKEEPER_TOOLS_LOG4J_LOGLEVEL: "WARN"
KAFKA_OPTS: "-Dzookeeper.4lw.commands.whitelist=ruok"
volumes:
- - "sentry-zookeeper:/var/lib/zookeeper/data"
- - "sentry-zookeeper-log:/var/lib/zookeeper/log"
+ - "/data/tncdata/sentry/sentry-zookeeper:/var/lib/zookeeper/data"
+ - "/data/tncdata/sentry/sentry-zookeeper-log:/var/lib/zookeeper/log"
- "sentry-secrets:/etc/zookeeper/secrets"
healthcheck:
<<: *healthcheck_defaults
@@ -184,8 +190,8 @@ services:
KAFKA_LOG4J_ROOT_LOGLEVEL: "WARN"
KAFKA_TOOLS_LOG4J_LOGLEVEL: "WARN"
volumes:
- - "sentry-kafka:/var/lib/kafka/data"
- - "sentry-kafka-log:/var/lib/kafka/log"
+ - "/data/tncdata/sentry/sentry-kafka:/var/lib/kafka/data"
+ - "/data/tncdata/sentry/sentry-kafka-log:/var/lib/kafka/log"
- "sentry-secrets:/etc/kafka/secrets"
healthcheck:
<<: *healthcheck_defaults
@@ -197,14 +203,15 @@ services:
context:
./clickhouse
args:
- BASE_IMAGE: "${CLICKHOUSE_IMAGE:-}"
+ #BASE_IMAGE: "${CLICKHOUSE_IMAGE:-}"
+ BASE_IMAGE: "yandex/clickhouse-server:20.3.9.70"
ulimits:
nofile:
soft: 262144
hard: 262144
volumes:
- - "sentry-clickhouse:/var/lib/clickhouse"
- - "sentry-clickhouse-log:/var/log/clickhouse-server"
+ - "/data/tncdata/sentry/sentry-clickhouse:/var/lib/clickhouse"
+ - "/data/tncdata/sentry/sentry-clickhouse-log:/var/log/clickhouse-server"
- type: bind
read_only: true
source: ./clickhouse/config.xml
@@ -213,7 +220,7 @@ services:
# This limits Clickhouse's memory to 30% of the host memory
# If you have high volume and your search return incomplete results
# You might want to change this to a higher value (and ensure your host has enough memory)
- MAX_MEMORY_USAGE_RATIO: 0.3
+ MAX_MEMORY_USAGE_RATIO: 0.4
healthcheck:
test:
[
@@ -388,11 +395,17 @@ volumes:
sentry-symbolicator:
external: true
+ sentry-zookeeper-log:
+ external: true
+ sentry-kafka-log:
+ external: true
+ sentry-clickhouse-log:
+ external: true
+ geoip:
+ external: true
+
# These store ephemeral data that needn't persist across restarts.
sentry-secrets:
sentry-smtp:
sentry-nginx-cache:
- sentry-zookeeper-log:
- sentry-kafka-log:
sentry-smtp-log:
- sentry-clickhouse-log:
Incoming errors are affected:

But transactions are not affected (except when Sentry was down for a while):

Is there a way to try to re-run the migrations? Or rollback?
PS: We run the postgres database on a separate host, version 9.5.14.
Hm, based on
stat: cannot statx '/data': Operation not permitted
and your docker-compose.yml changes, it seems that the bind mount you're using doesn't have the correct permissions for clickhouse or some other service. I would recommend checking that the snuba user in the snuba container has access to that location.
Once you've done that, you should be able to re-run ./install.sh and it should fix things if I'm not mistaken.
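If it helps, here is a rough sketch for checking the bind-mount directories from the host before re-running ./install.sh. It only looks at the classic owner/group/other mode bits, so it will not catch ACLs, SELinux labels, or restrictions imposed by the container runtime itself. The function name and the UID/GID values are placeholders I made up; substitute whatever user the container actually runs as. The paths are the ones from the docker-compose.yml diff above.

```python
import os


def mode_bits_access(path, uid, gid):
    """Report what the classic Unix mode bits allow for uid/gid on path.

    Simplified: ignores ACLs, SELinux labels, supplementary groups,
    and any seccomp/capability limits applied to the container.
    """
    st = os.stat(path)  # raises OSError if the path is missing
    if uid == 0:
        # Simplification: treat root as bypassing the mode bits.
        return {"read": True, "write": True, "execute": True}
    if st.st_uid == uid:
        bits = (st.st_mode >> 6) & 0o7  # owner bits
    elif st.st_gid == gid:
        bits = (st.st_mode >> 3) & 0o7  # group bits
    else:
        bits = st.st_mode & 0o7  # other bits
    return {
        "read": bool(bits & 4),
        "write": bool(bits & 2),
        "execute": bool(bits & 1),
    }


if __name__ == "__main__":
    # Hypothetical UID/GID: replace with the user the service runs as.
    CONTAINER_UID = 1000
    CONTAINER_GID = 1000
    for p in (
        "/data/tncdata/sentry/sentry-data",
        "/data/tncdata/sentry/sentry-clickhouse",
        "/data/tncdata/sentry/sentry-kafka",
    ):
        try:
            print(p, mode_bits_access(p, CONTAINER_UID, CONTAINER_GID))
        except OSError as exc:
            print(p, "->", exc)
```

Any directory that reports write or execute as False for the service user is a candidate for a chown or chmod on the host. If everything looks fine here and the "Operation not permitted" messages persist, the cause is probably outside plain file permissions.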