
Unable to fetch attachments

Open wimg opened this issue 3 months ago • 40 comments

Self-Hosted Version

25.9.0

CPU Architecture

x86_64

Docker Version

28.5.0

Docker Compose Version

28.5.0

Machine Specification

  • [x] My system meets the minimum system requirements of Sentry

Steps to Reproduce

  1. Upgrade from Sentry 25.1.0 to 25.8.0
  2. Go to an entry from before the upgrade
  3. Notice "An error occurred while fetching attachments" and "There was an error loading data"

Expected Result

All blocks loading correctly

Actual Result

[Two screenshots attached showing the errors described above]

Event ID

No response

wimg avatar Oct 06 '25 10:10 wimg

I noticed in the meantime that I was supposed to upgrade to 25.5.1 first. But without reading the upgrade instructions every single time, people miss that. So wouldn't it make sense to put a check in the install.sh script that verifies you are not skipping one of the releases you must pass through?
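
Something along these lines is what I have in mind (purely a sketch, not the actual install.sh logic; the hard-stop list would need to come from the docs):

#!/usr/bin/env bash
# Hypothetical hard-stop guard, NOT the real install.sh code.
# Assumes the previously installed version is passed as an argument and that
# HARD_STOPS mirrors https://develop.sentry.dev/self-hosted/releases/#hard-stops.
set -euo pipefail

PREVIOUS_VERSION="${1:?usage: $0 <previous-version>}"  # e.g. 25.1.0
TARGET_VERSION="25.9.0"
HARD_STOPS=("25.5.1")  # illustrative; fill in the full list from the docs

for stop in "${HARD_STOPS[@]}"; do
  lowest="$(printf '%s\n%s\n' "$PREVIOUS_VERSION" "$stop" | sort -V | head -n1)"
  highest="$(printf '%s\n%s\n' "$TARGET_VERSION" "$stop" | sort -V | tail -n1)"
  # A hard stop strictly between the previous and target versions means the upgrade path skips it.
  if [[ "$lowest" == "$PREVIOUS_VERSION" && "$highest" == "$TARGET_VERSION" && "$stop" != "$PREVIOUS_VERSION" && "$stop" != "$TARGET_VERSION" ]]; then
    echo "You skipped hard stop $stop: upgrade to $stop before $TARGET_VERSION." >&2
    exit 1
  fi
done
echo "No hard stops skipped between $PREVIOUS_VERSION and $TARGET_VERSION."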

Also, I'm not sure how to proceed now, since I suppose downgrading won't work. Any suggestions?

wimg avatar Oct 06 '25 15:10 wimg

I have exactly the same issue. Upgrading to 25.8 is fine, but after upgrading to 25.9 this issue occurs. And by the way, I have just tried a downgrade and it does work. @wimg You could downgrade to 25.8 for now. Upgrading to 25.9 still does not work for me.

myonlylonely avatar Oct 06 '25 15:10 myonlylonely

We're experiencing the same.

Reverted my VM to the 25.8 snapshot I made before upgrading and everything works as expected but upgrading by downloading the release from https://github.com/getsentry/self-hosted/releases/tag/25.9.0 and running ./install.sh results in the same issue as mentioned above.

Chasx avatar Oct 06 '25 21:10 Chasx

Hmm, that makes it seem as if it's a 25.9.0 issue, not the fact that I skipped 25.5.1?

wimg avatar Oct 06 '25 21:10 wimg

I’m not entirely sure, but I don’t think so. We’ve had several issues with older versions in the past because I hadn’t upgraded through all the ‘hard stops’ as documented in the Sentry upgrade guide: https://develop.sentry.dev/self-hosted/releases/#hard-stops. Earlier this year, I manually upgraded through each of the versions listed in that document, and since then, upgrades have gone smoothly.

Before version 25.8, I believe we also ran other versions like 25.7 and 25.6, as I have snapshots from the months those versions were released. We usually create those snapshots when upgrading to a newer version.

Chasx avatar Oct 06 '25 21:10 Chasx

I upgraded from 25.7 to 25.9, and before this I did follow the instructions and took the hard stop from 25.5 to 25.5.1. So I think this issue is only related to 25.9.

myonlylonely avatar Oct 07 '25 08:10 myonlylonely

I have the same issue on 25.9 with "error occurred while fetching attachments". I think it's associated with sentry-self-hosted-relay-1. In the docker logs I see errors like:

2025-10-08T13:46:40.636146Z ERROR relay_server::services::projects::source::upstream: error fetching project state 8b7620f43375fd8f779d094df8bb21af: deadline exceeded errors=0 pending=268 tags.did_error=false tags.was_pending=true tags.project_key="8b7620f43375fd8f779d094df8bb21af"
2025-10-08T13:46:40.636206Z ERROR relay_server::services::projects::cache::service: failed to fetch project from source: Fetch { project_key: ProjectKey("8b7620f43375fd8f779d094df8bb21af"), previous_fetch: None, initiated: Instant { tv_sec: 7284, tv_nsec: 747176400 }, when: Some(Instant { tv_sec: 8446, tv_nsec: 863228191 }), revision: Revision(None) } tags.project_key="8b7620f43375fd8f779d094df8bb21af" tags.has_revision=false error=upstream error failed to send message to service error.sources=[failed to send message to service]

How can we resolve this? Can I revert/downgrade to 25.8 with the install.sh script?

cyfran avatar Oct 08 '25 13:10 cyfran

@cyfran No official response yet. But I did successfully downgrade to 25.8 using the install.sh script. Just change the Docker image tags in .env to 25.8.
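
For example, something like this pins all images back at once (a sketch; it assumes every *_IMAGE line in .env ends with the :25.9.0 tag, as in a default 25.9.0 .env):

# Sketch: pin every *_IMAGE entry in .env back to 25.8.0, then reinstall.
sed -i.bak 's/:25\.9\.0$/:25.8.0/' .env
grep '_IMAGE=' .env   # double-check the tags before rerunning
./install.sh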

myonlylonely avatar Oct 08 '25 14:10 myonlylonely

@cyfran BTW, did you change SENTRY_BIND in .env from SENTRY_BIND=9000 to something like SENTRY_BIND=127.0.0.1:9000?

myonlylonely avatar Oct 08 '25 14:10 myonlylonely

No:

[root@log self-hosted]# cat .env
COMPOSE_PROJECT_NAME=sentry-self-hosted
# Set COMPOSE_PROFILES to "feature-complete" to enable all features
# To enable errors monitoring only, set COMPOSE_PROFILES=errors-only
# See https://develop.sentry.dev/self-hosted/experimental/errors-only/
COMPOSE_PROFILES=feature-complete
SENTRY_EVENT_RETENTION_DAYS=90
# You can either use a port number or an IP:PORT combo for SENTRY_BIND
# See https://docs.docker.com/compose/compose-file/#ports for more
SENTRY_BIND=9000
# Set SENTRY_MAIL_HOST to a valid FQDN (host/domain name) to be able to send emails!
# SENTRY_MAIL_HOST=example.com
SENTRY_IMAGE=ghcr.io/getsentry/sentry:25.9.0
SNUBA_IMAGE=ghcr.io/getsentry/snuba:25.9.0
RELAY_IMAGE=ghcr.io/getsentry/relay:25.9.0
SYMBOLICATOR_IMAGE=ghcr.io/getsentry/symbolicator:25.9.0
TASKBROKER_IMAGE=ghcr.io/getsentry/taskbroker:25.9.0
VROOM_IMAGE=ghcr.io/getsentry/vroom:25.9.0
UPTIME_CHECKER_IMAGE=ghcr.io/getsentry/uptime-checker:25.9.0
HEALTHCHECK_INTERVAL=30s
HEALTHCHECK_TIMEOUT=1m30s
HEALTHCHECK_RETRIES=10
HEALTHCHECK_START_PERIOD=10s
HEALTHCHECK_FILE_INTERVAL=60s
HEALTHCHECK_FILE_TIMEOUT=10s
HEALTHCHECK_FILE_RETRIES=3
HEALTHCHECK_FILE_START_PERIOD=600s
# Set SETUP_JS_SDK_ASSETS to 1 to enable the setup of JS SDK assets
# SETUP_JS_SDK_ASSETS=1
[root@log self-hosted]#

cyfran avatar Oct 08 '25 14:10 cyfran

Does the relay's upstream web container have any error messages?
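
Something like this should surface them (a sketch, assuming the default compose service names):

# Sketch: look for 5xx responses and tracebacks in relay's upstream (the web container).
docker compose logs --tail 500 web | grep -E '" 5[0-9]{2} |Traceback'
# And the relay side, for the project-state fetch errors shown above:
docker compose logs --tail 500 relay | grep -i error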

myonlylonely avatar Oct 08 '25 14:10 myonlylonely

We also did not change the .env file in any way. We basically download the release, docker compose down the running release, unpack, and run install.sh. Up to 25.9, the new version then just ran fine after that.

My colleague mentioned a change (in the .env?) from postgres to pgbouncer, but we have not investigated at all whether that is related.

Chasx avatar Oct 08 '25 14:10 Chasx

Yes, I have 500 errors:

GET /api/0/organizations/.../issues/12306/events/recommended/?collapse=fullRelease HTTP/1.0" 500 72
GET /api/0/organizations/.../issues/12307/events/recommended/?collapse=fullRelease HTTP/1.0" 500 72

cyfran avatar Oct 08 '25 14:10 cyfran

@aldy505 Hi, is there anything we could try to solve or diagnose this issue?

myonlylonely avatar Oct 08 '25 14:10 myonlylonely

As I checked, these errors match in time with errors in sentry-self-hosted-seaweedfs-1 about data that cannot be found:

E1008 14:58:05.787156 stream.go:107 request_id:d4a56546-3b98-4135-936e-b7cea5dd25e6operation LookupFileId 5,70ff67d48924d4 failed, err: urls not found
E1008 14:58:05.787192 filer_server_handlers_read.go:262 request_id:d4a56546-3b98-4135-936e-b7cea5dd25e6 failed to prepare stream content /buckets/nodestore/nodestore/dbac2b5fcba7c7d23c6f33ea7fd437b8: operation LookupFileId 5,70ff67d48924d4 failed, err: urls not found
E1008 14:58:05.787213 common.go:310 ProcessRangeRequest: operation LookupFileId 5,70ff67d48924d4 failed, err: urls not found
E1008 14:58:09.051937 stream.go:107 request_id:54639a84-dd72-4e78-9345-dfc5e96d4c9coperation LookupFileId 5,70ff67d48924d4 failed, err: urls not found
E1008 14:58:09.051958 filer_server_handlers_read.go:262 request_id:54639a84-dd72-4e78-9345-dfc5e96d4c9c failed to prepare stream content /buckets/nodestore/nodestore/dbac2b5fcba7c7d23c6f33ea7fd437b8: operation LookupFileId 5,70ff67d48924d4 failed, err: urls not found
E1008 14:58:09.051973 common.go:310 ProcessRangeRequest: operation LookupFileId 5,70ff67d48924d4 failed, err: urls not found

config.yml

# Uploaded media uses these `filestore` settings. The available
# backends are either `filesystem` or `s3`.

filestore.backend: 'filesystem'
filestore.options:
  location: '/data/files'
dsym.cache-path: '/data/dsym-cache'
releasefile.cache-path: '/data/releasefile-cache'

# filestore.backend: 's3'
# filestore.options:
#   access_key: 'AKIXXXXXX'
#   secret_key: 'XXXXXXX'
#   bucket_name: 's3-bucket-name'

If I fully wipe /data in seaweedfs and rebuild it, will that resolve the issue for new events or not? In other words, what can be done to ensure that new events arrive and are read correctly, even if old ones are unavailable?
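
To check whether those blobs even exist in the bucket, something like this with the s3cmd client bundled in the seaweedfs container should work (just a guess at the flags; they match the self-hosted defaults):

# Sketch: list what is actually stored in the nodestore bucket.
docker compose exec seaweedfs s3cmd --access_key=sentry --secret_key=sentry \
  --no-ssl --region=us-east-1 --host=localhost:8333 \
  --host-bucket='localhost:8333/%(bucket)' ls s3://nodestore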

cyfran avatar Oct 08 '25 15:10 cyfran

So it seems the new release introduced node storage (https://develop.sentry.dev/backend/application-domains/nodestore/):

The default backend simply stores them as gzipped blobs in the ‘nodestore_node’ table of your default database.

I think the old events are not migrated to the new table?

myonlylonely avatar Oct 08 '25 15:10 myonlylonely

In the release description:

We're introducing SeaweedFS with its S3-compatible API for storing Nodestore data. This is opt-in, and you'll see a prompt during installation, similar to PGBouncer.

But I clearly did not see this prompt.

myonlylonely avatar Oct 08 '25 15:10 myonlylonely

Yes, I rebuilt via install.sh and did not see any question about migrating to S3 storage, but the requests do point to something like S3 storage: "failed to prepare stream content /buckets/nodestore/nodestore/...

cyfran avatar Oct 08 '25 15:10 cyfran

I think the problem is that we have SENTRY_NODESTORE in sentry.conf.py. The migration script https://github.com/getsentry/self-hosted/blob/master/install/bootstrap-s3-nodestore.sh requires it to be missing. We could try to delete the Node Storage part in sentry.conf.py and run install.sh again; this time it should ask you about the automatic migration of node storage.
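
A quick way to check before rerunning (a sketch; paths as in the self-hosted checkout):

# Sketch: verify whether sentry.conf.py still defines a node store; the
# bootstrap script only offers the migration when it is absent.
grep -n 'SENTRY_NODESTORE' sentry/sentry.conf.py
# After removing or commenting out that block by hand, rerun the installer:
./install.sh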

myonlylonely avatar Oct 08 '25 15:10 myonlylonely

Here is what I tried:

  1. Delete s3://nodestore:
     docker compose exec seaweedfs s3cmd --access_key=sentry --secret_key=sentry --no-ssl --region=us-east-1 --host=localhost:8333 --host-bucket='localhost:8333/%(bucket)' rb s3://nodestore
  2. Remove SENTRY_NODESTORE and SENTRY_NODESTORE_OPTIONS from sentry.conf.py.
  3. Run ./install.sh again.
  4. Press Y when the node store migration prompt appears.
  5. Do the rest of the process as usual and start the containers.

But unfortunately this still does not work. 😣 @BYK Can you help us?

myonlylonely avatar Oct 09 '25 04:10 myonlylonely

I think the old events are not migrated to the new table?

As seen in the release notes, I don't think so:

Good news: you won't lose your existing event details from Postgres until the retention period passes, and there's no need to migrate your Nodestore data from Postgres to S3.

Sounds like new events use Seaweed (if opted in) while old events stored in Postgres are still accessed there, and you're encouraged to fully switch to Seaweed once your retention period ends (so no more old events are stored in Postgres).

stevenobird avatar Oct 09 '25 05:10 stevenobird

@stevenobird Yes, but the problem is that old events' attachments and stacktraces are not viewable now.

myonlylonely avatar Oct 09 '25 06:10 myonlylonely

@stevenobird Is there a way to disable Seaweed?

myonlylonely avatar Oct 09 '25 06:10 myonlylonely

@stevenobird Is there a way to disable Seaweed?

Can't say whether you can disable it after having enabled it in your Sentry instance, but from reading the changelog and upgrade path I would make the educated guess that you can use both until the bug is fixed:

You could try to set "write_through" to True in your SENTRY_NODESTORE_OPTIONS config as seen here: https://github.com/stayallive/sentry-nodestore-s3#installation

SENTRY_NODESTORE_OPTIONS = {
    "delete_through": True,     # delete through to the Django nodestore (delete object from S3 and Django)
    "write_through": True,      # write through to the Django nodestore (duplicate writes to S3 and Django)
    "read_through": True,       # read through to the Django nodestore (if object not found in S3)
   ...
}

By setting write_through to True, Nodestore will store events in both S3 and the old storage (Postgres). delete_through and read_through enable Nodestore to check Postgres for old events, so keep them set to True.
Please note that this will increase storage usage.
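
If SENTRY_NODESTORE_OPTIONS is already defined in your sentry/sentry.conf.py (the bootstrap script should have added it), a rough way to add the flags without rewriting the dict could be something like this; double-check that you don't end up with the keys defined twice:

# Sketch: append the pass-through flags to the options dict the bootstrap
# script created. Assumes SENTRY_NODESTORE_OPTIONS already exists in the file.
cat >> sentry/sentry.conf.py <<'EOF'

# Keep Postgres as a fallback while SeaweedFS is the primary node store.
SENTRY_NODESTORE_OPTIONS.update({
    "read_through": True,    # read from Postgres when the object is missing in S3
    "write_through": True,   # write to both S3 and Postgres
    "delete_through": True,  # delete from both backends
})
EOF
# Restart the services that load sentry.conf.py (feature-complete setups have more than these).
docker compose restart web worker cron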

For you guys having the issue: do you have Seaweed enabled, and if yes, is read_through set to True so that at least old event attachments stay accessible?
The issue for new events is this bug: https://github.com/getsentry/self-hosted/pull/3991 (SeaweedFS's data volume is empty after the service is restarted), so keep that in mind.

I've just noticed that even my Sentry instance does not have those options set, but I have not yet faced this issue since my instance has not been restarted after the upgrade. The migration script should have added those options, but I don't know why that didn't happen here: https://github.com/getsentry/self-hosted/commit/84f904f7a15698cc5d8f0ce492612cd057f36721#diff-bc368aa756cdf15cdf4da2499bee17d06f84429144b47bc87af1ac867978a024R54-R60

stevenobird avatar Oct 09 '25 06:10 stevenobird

After applying the fix https://github.com/getsentry/self-hosted/pull/3991, I nuked all the old events, and now it works. Nuking all the old events is a bad choice, but these old events are not that important to me. Obviously, this might not be acceptable for other people.

myonlylonely avatar Oct 10 '25 12:10 myonlylonely

@myonlylonely How do I erase all old events? Via cleanup --days 1 and DELETE FROM nodestore_node WHERE timestamp < 'now';, or some other way?

cyfran avatar Oct 10 '25 12:10 cyfran

@cyfran

docker-compose exec worker bash
sentry cleanup --days 0

myonlylonely avatar Oct 10 '25 12:10 myonlylonely

After applying the fix #3991, I nuked all the old events, and now it works. Nuking all the old events is a bad choice, but these old events are not that important to me. Obviously, this might not be acceptable for other people.

This shouldn't be done. For people already impacted, I believe we can just clear out the data on SeaweedFS without removing all events via the sentry cleanup command. But for people who haven't been impacted, I'd say we should do something like mv /tmp /data, though I wonder whether this would break things.
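
For reference, emptying just the nodestore bucket with the bundled s3cmd would look something like this (untested; same flags as the command posted earlier in this thread), which leaves the Postgres data in place for read_through:

# Sketch: empty only the SeaweedFS nodestore bucket; Postgres data is untouched,
# so read_through can still serve old events from there.
docker compose exec seaweedfs s3cmd --access_key=sentry --secret_key=sentry \
  --no-ssl --region=us-east-1 --host=localhost:8333 \
  --host-bucket='localhost:8333/%(bucket)' del --recursive --force s3://nodestore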

aldy505 avatar Oct 11 '25 11:10 aldy505

I cleaned all events with "sentry cleanup --days 0", but after a few hours I am seeing trouble with

2025-10-13T13:56:13.172935Z ERROR relay_server::services::projects::source::upstream: error fetching project state e0623607b660cffdc40255cbe0b27f88: deadline exceeded errors=0 pending=268 tags.did_error=false tags.was_pending=true tags.project_key="e0623607b660cffdc40255cbe0b27f88"
2025-10-13T13:56:13.172984Z ERROR relay_server::services::projects::source::upstream: error fetching project state 8b7620f43375fd8f779d094df8bb21af: deadline exceeded errors=0 pending=268 tags.did_error=false tags.was_pending=true tags.project_key="8b7620f43375fd8f779d094df8bb21af"
2025-10-13T13:56:13.173021Z ERROR relay_server::services::projects::cache::service: failed to fetch project from source: Fetch { project_key: ProjectKey("e0623607b660cffdc40255cbe0b27f88"), previous_fetch: None, initiated: Instant { tv_sec: 238877, tv_nsec: 545989812 }, when: Some(Instant { tv_sec: 260040, tv_nsec: 356032497 }), revision: Revision(None) } tags.project_key="e0623607b660cffdc40255cbe0b27f88" tags.has_revision=false error=upstream error failed to send message to service error.sources=[failed to send message to service]
2025-10-13T13:56:13.173098Z ERROR relay_server::services::projects::cache::service: failed to fetch project from source: Fetch { project_key: ProjectKey("8b7620f43375fd8f779d094df8bb21af"), previous_fetch: None, initiated: Instant { tv_sec: 235272, tv_nsec: 216505594 }, when: Some(Instant { tv_sec: 260040, tv_nsec: 356067942 }), revision: Revision(None) } tags.project_key="8b7620f43375fd8f779d094df8bb21af" tags.has_revision=false error=upstream error failed to send message to service error.sources=[failed to send message to service]
2025-10-13T13:56:24.678817Z ERROR relay_server::services::projects::source::upstream: error fetching project state 5765ea26f611891bdfbdea9a623e173d: deadline exceeded errors=0 pending=264 tags.did_error=false tags.was_pending=true tags.project_key="5765ea26f611891bdfbdea9a623e173d"
2025-10-13T13:56:24.678881Z ERROR relay_server::services::projects::cache::service: failed to fetch project from source: Fetch { project_key: ProjectKey("5765ea26f611891bdfbdea9a623e173d"), previous_fetch: Some(Instant { tv_sec: 240207, tv_nsec: 291652357 }), initiated: Instant { tv_sec: 240507, tv_nsec: 293556409 }, when: Some(Instant { tv_sec: 260051, tv_nsec: 850486538 }), revision: Revision(Some("3f06a71152e043c4be76e150b0141f12")) } tags.project_key="5765ea26f611891bdfbdea9a623e173d" tags.has_revision=true error=upstream error failed to send message to service error.sources=[failed to send message to service]

repeated again and again. As far as I know, "cache" is a Redis service, but it is running. I also checked the "kafka" service, and it is also running without errors. How can I fix these errors?

cyfran avatar Oct 13 '25 13:10 cyfran

@myonlylonely Did you change the version to 25.8.0 for all images, or only SENTRY_IMAGE? I've tried everything I can, but the problem with

relay_server::services::projects::source::upstream
relay_server::services::projects::cache::service

still persists, and I cannot resolve it in any way.

cyfran avatar Oct 13 '25 15:10 cyfran