502 Bad Gateway on most requests
gonic version: nightly
docker tag: nightly
(Although I'm running on nightly at the moment, this also happens on the most recent release.)
As of about a week ago, I've started seeing 502 errors on almost all requests to my gonic instance.
Here are the logs from just GETting /home while logged in:
2025/10/24 14:20:01 response 200 GET /home
2025/10/24 14:20:12 error writing to response buffer: write tcp 172.17.0.9:4747->172.17.0.1:47186: i/o timeout
2025/10/24 14:20:12 response 200 GET /home
2025/10/24 14:20:20 error writing to response buffer: write tcp 172.17.0.9:4747->172.17.0.1:43696: i/o timeout
2025/10/24 14:20:20 response 200 GET /home
Even the subsonic API has similar issues:
2025/10/24 14:21:41 response 200 GET /getAvatar?u=guest&s=REDACTED&v=1.2.0&c=DSub&username=guest
2025/10/24 14:21:41 response 200 GET /getUser?u=guest&s=REDACTED&t=REDACTED&v=1.2.0&c=DSub&username=guest
2025/10/24 14:21:41 response 200 GET /getBookmarks?u=guest&s=REDACTED&t=REDACTED&v=1.2.0&c=DSub
2025/10/24 14:21:59 error writing subsonic response: write tcp 172.17.0.9:4747->172.17.0.1:55588: i/o timeout
2025/10/24 14:21:59 response 200 GET /getAlbumList2?u=guest&s=REDACTED&t=REDACTED&v=1.2.0&c=DSub&type=newest&size=20&offset=0
2025/10/24 14:22:00 response 200 GET /getCoverArt?u=guest&s=REDACTED&t=REDACTED&v=1.2.0&c=DSub&id=al-45762
2025/10/24 14:22:00 response 200 GET /getCoverArt?u=guest&s=REDACTED&t=REDACTED&v=1.2.0&c=DSub&id=al-46161
My only guess at the issue is that I'm hitting some sort of limit due to the size of my library (290517 tracks, ~29815 albums according to beet stats).
hi @djl looks like we have some slow queries, can you please pull the latest :nightly and enable the log-db setting? then load some pages and it should log what the slow queries are
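for example, assuming the setting follows gonic's usual flag/env-var naming (the exact spelling may differ, gonic -h on the nightly will show it):
$ docker pull sentriz/gonic:nightly
$ docker run -e GONIC_LOG_DB=true sentriz/gonic:nightly   # plus your usual volumes/ports; or pass -log-db when running the binary directly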
Here's the output from a request to /home
gorm sql  /src/podcast/podcast.go:479  2.359605ms  SELECT * FROM "podcast_episodes" WHERE (status=?) ORDER BY updated_at DESC  [downloading]  0
gorm sql  /go/pkg/mod/github.com/sentriz/[email protected]/gormstore.go:129  1.609375ms  SELECT * FROM "sessions" WHERE (id = ? AND expires_at > ?) ORDER BY "sessions"."id" ASC LIMIT 1  [SNIP SNIP]  1
gorm sql  /src/db/db.go:103  1.425327ms  SELECT * FROM "users" WHERE (id=?) ORDER BY "users"."id" ASC LIMIT 1  [1]  1
gorm sql  /src/db/db.go:89  375.619µs  SELECT count(*) FROM "albums"  []  0
gorm sql  /src/db/db.go:90  1.033442ms  SELECT count(*) FROM ( SELECT count(*) as name FROM "album_artists" GROUP BY album_id ) AS count_table  []  0
gorm sql  /src/db/db.go:91  1.080507ms  SELECT count(*) FROM ( SELECT count(*) as name FROM "track_artists" GROUP BY artist_id ) AS count_table  []  0
gorm sql  /src/podcast/podcast.go:479  529.553679ms  SELECT * FROM "podcast_episodes" WHERE (status=?) ORDER BY updated_at DESC  [downloading]  0
gorm sql  /src/db/db.go:92  3.265451ms  SELECT count(*) FROM ( SELECT count(*) as name FROM "album_artists" GROUP BY artist_id ) AS count_table  []  0
gorm sql  /src/db/db.go:93  446.991µs  SELECT count(*) FROM "tracks"  []  0
gorm sql  /src/db/db.go:94  401.321µs  SELECT count(*) FROM "internet_radio_stations"  []  0
gorm sql  /src/db/db.go:95  331.342µs  SELECT count(*) FROM "podcasts"  []  0
gorm sql  /src/db/db.go:160  974.829µs  SELECT * FROM "settings" WHERE (key=?) ORDER BY "settings"."key" ASC LIMIT 1  [lastfm_api_key]  0
gorm sql  /src/server/ctrladmin/handlers.go:50  1.218879ms  SELECT * FROM "users"  []  2
gorm sql  /src/server/ctrladmin/handlers.go:56  2.879930671s  SELECT * FROM "albums" ORDER BY created_at DESC LIMIT 10  []  10
gorm sql  /src/db/db.go:160  1.530669ms  SELECT * FROM "settings" WHERE (key=?) ORDER BY "settings"."key" ASC LIMIT 1  [last_scan_time]  1
gorm sql  /src/server/ctrladmin/handlers.go:67  1.14396ms  SELECT * FROM "transcode_preferences" WHERE (user_id=?)  [1]  0
gorm sql  /src/server/ctrladmin/handlers.go:73  1.278219ms  SELECT * FROM "podcasts"  []  0
gorm sql  /src/server/ctrladmin/handlers.go:76  1.100552ms  SELECT * FROM "internet_radio_stations"  []  0
gorm sql  /go/pkg/mod/github.com/sentriz/[email protected]/gormstore.go:177  1.949776ms  INSERT INTO "sessions" ("id","data","created_at","updated_at","expires_at") VALUES (?,?,?,?,?)  [SNIP]  1
gorm sql  /go/pkg/mod/github.com/sentriz/[email protected]/gormstore.go:185  1.364749ms  UPDATE "sessions" SET "data" = ?, "created_at" = ?, "updated_at" = ?, "expires_at" = ? WHERE "sessions"."id" = ?  [SNIP]  1
2025/10/24 17:58:44 error writing to response buffer: write tcp 172.17.0.9:4747->172.17.0.1:56198: i/o timeout
2025/10/24 17:58:44 response 200 GET /home
gorm sql  /go/pkg/mod/github.com/sentriz/[email protected]/gormstore.go:129  2.785177ms  SELECT * FROM "sessions" WHERE (id = ? AND expires_at > ?) ORDER BY "sessions"."id" ASC LIMIT 1  [SNIP]  1
gorm sql  /src/db/db.go:103  2.574099ms  SELECT * FROM "users" WHERE (id=?) ORDER BY "users"."id" ASC LIMIT 1  [1]  1
gorm sql  /src/db/db.go:89  594.532µs  SELECT count(*) FROM "albums"  []  0
gorm sql  /src/db/db.go:90  1.110561ms  SELECT count(*) FROM ( SELECT count(*) as name FROM "album_artists" GROUP BY album_id ) AS count_table  []  0
gorm sql  /src/podcast/podcast.go:479  68.17317ms  SELECT * FROM "podcast_episodes" WHERE (status=?) ORDER BY updated_at DESC  [downloading]  0
gorm sql  /src/db/db.go:91  3.376755ms  SELECT count(*) FROM ( SELECT count(*) as name FROM "track_artists" GROUP BY artist_id ) AS count_table  []  0
gorm sql  /src/db/db.go:92  1.173593ms  SELECT count(*) FROM ( SELECT count(*) as name FROM "album_artists" GROUP BY artist_id ) AS count_table  []  0
gorm sql  /src/podcast/podcast.go:479  14.165208ms  SELECT * FROM "podcast_episodes" WHERE (status=?) ORDER BY updated_at DESC  [downloading]  0
gorm sql  /src/db/db.go:93  2.502432ms  SELECT count(*) FROM "tracks"  []  0
gorm sql  /src/db/db.go:94  377.771µs  SELECT count(*) FROM "internet_radio_stations"  []  0
gorm sql  /src/db/db.go:95  332.619µs  SELECT count(*) FROM "podcasts"  []  0
gorm sql  /src/db/db.go:160  918.521µs  SELECT * FROM "settings" WHERE (key=?) ORDER BY "settings"."key" ASC LIMIT 1  [lastfm_api_key]  0
gorm sql  /src/server/ctrladmin/handlers.go:50  1.30116ms  SELECT * FROM "users"  []  2
gorm sql  /src/server/ctrladmin/handlers.go:56  2.84118132s  SELECT * FROM "albums" ORDER BY created_at DESC LIMIT 10  []  10
gorm sql  /src/db/db.go:160  1.186326ms  SELECT * FROM "settings" WHERE (key=?) ORDER BY "settings"."key" ASC LIMIT 1  [last_scan_time]  1
gorm sql  /src/server/ctrladmin/handlers.go:67  965.662µs  SELECT * FROM "transcode_preferences" WHERE (user_id=?)  [1]  0
gorm sql  /src/server/ctrladmin/handlers.go:73  1.102752ms  SELECT * FROM "podcasts"  []  0
gorm sql  /src/server/ctrladmin/handlers.go:76  713.059µs  SELECT * FROM "internet_radio_stations"  []  0
gorm sql  /go/pkg/mod/github.com/sentriz/[email protected]/gormstore.go:177  1.427798ms  INSERT INTO "sessions" ("id","data","created_at","updated_at","expires_at") VALUES (?,?,?,?,?)  [SNIP]  1
gorm sql  /go/pkg/mod/github.com/sentriz/[email protected]/gormstore.go:185  2.496901ms  UPDATE "sessions" SET "data" = ?, "created_at" = ?, "updated_at" = ?, "expires_at" = ? WHERE "sessions"."id" = ?  [SNIP]  1
2025/10/24 17:58:53 error writing to response buffer: write tcp 172.17.0.9:4747->172.17.0.1:48876: i/o timeout
2025/10/24 17:58:53 response 200 GET /home
And here's one from the subsonic API ("Recently added" in DSub):
gorm sql  /src/db/db.go:115  1.853255ms  SELECT * FROM "users" WHERE (name=?) ORDER BY "users"."id" ASC LIMIT 1  [guest]  1
gorm sql  /src/server/ctrlsubsonic/handlers_by_tags.go:205  15.980158005s  SELECT albums.*, count(tracks.id) child_count, sum(tracks.length) duration FROM "albums" LEFT JOIN tracks ON tracks.album_id=albums.id JOIN album_artists ON album_artists.album_id=albums.id GROUP BY albums.id ORDER BY created_at DESC LIMIT 20 OFFSET 0  []  20
gorm sql  /src/podcast/podcast.go:479  13.144556797s  SELECT * FROM "podcast_episodes" WHERE (status=?) ORDER BY updated_at DESC  [downloading]  0
gorm sql  /src/server/ctrlsubsonic/handlers_by_tags.go:205  7.372908ms  SELECT * FROM "artists" INNER JOIN "album_artists" ON "album_artists"."artist_id" = "artists"."id" WHERE ("album_artists"."album_id" IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?))  [89982 89962 89973 65378 89986 89988 89950 89969 65016 89991 89992 89989 89984 89978 89955 89974 66734 89976 89959 89987]  0
gorm sql  /src/podcast/podcast.go:479  10.163381ms  SELECT * FROM "podcast_episodes" WHERE (status=?) ORDER BY updated_at DESC  [downloading]  0
gorm sql  /src/server/ctrlsubsonic/handlers_by_tags.go:205  4.29769ms  SELECT * FROM "genres" INNER JOIN "album_genres" ON "album_genres"."genre_id" = "genres"."id" WHERE ("album_genres"."album_id" IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?))  [89973 89989 89992 89982 66734 89976 65016 65378 89987 89959 89986 89978 89969 89988 89984 89991 89962 89950 89955 89974]  0
gorm sql  /src/server/ctrlsubsonic/handlers_by_tags.go:205  2.452666ms  SELECT * FROM "album_stars" WHERE ("album_id" IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)) AND (user_id=?)  [89988 89962 89950 89973 89969 89989 89986 89991 89982 89974 89959 65378 89987 89978 66734 65016 89984 89992 89955 89976 2]  0
gorm sql  /src/server/ctrlsubsonic/handlers_by_tags.go:205  2.581503ms  SELECT * FROM "album_ratings" WHERE ("album_id" IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)) AND (user_id=?)  [89974 65378 89987 89955 89969 65016 89959 89986 89992 89973 66734 89976 89988 89978 89950 89989 89984 89991 89982 89962 2]  0
gorm sql  /src/server/ctrlsubsonic/handlers_by_tags.go:205  2.185253ms  SELECT * FROM "plays" WHERE ("album_id" IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)) AND (user_id=?)  [89992 89959 89989 89986 89988 89984 89987 89978 89962 89955 89991 89982 89950 89974 89973 66734 89976 65016 89969 65378 2]  0
2025/10/24 18:00:28 error writing subsonic response: write tcp 172.17.0.9:4747->172.17.0.1:50178: i/o timeout
2025/10/24 18:00:28 response 200 GET /getAlbumList2?u=guest&s=SNIP&t=SNIP&v=1.2.0&c=DSub&type=newest&size=20&offset=0
Finally had some time to start debugging this a little bit.
So this query from handlers_by_tags.go:
gorm sql  /src/server/ctrlsubsonic/handlers_by_tags.go:205  15.980158005s  SELECT albums.*, count(tracks.id) child_count, sum(tracks.length) duration FROM "albums" LEFT JOIN tracks ON tracks.album_id=albums.id JOIN album_artists ON album_artists.album_id=albums.id GROUP BY albums.id ORDER BY created_at DESC LIMIT 20 OFFSET 0  []  20
If I'm reading this right, it's taking almost 16 seconds. If I run the query manually it only takes a few milliseconds and still returns the correct data:
time sqlite3 gonic.db 'SELECT albums.*, count(tracks.id) child_count, sum(tracks.length) duration FROM "albums" LEFT JOIN tracks ON tracks.album_id=albums.id JOIN album_artists ON album_artists.album_id=albums.id GROUP BY albums.id ORDER BY created_at DESC LIMIT 20'
... SNIP ...
real 0m0.129s
user 0m0.113s
sys 0m0.016s
Some sort of concurrency issue, maybe?
A little bit more debugging from this morning.
If I comment out the c.db.Stats() call in ServeHome, the home page loads in a reasonable time (about four seconds). However, if I then make two concurrent requests to the home page (with different users/clients), the timeout comes back.
Given that, I don't think this is a slow-query issue after all, especially since the queries all return very quickly when run manually.
Weird!
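For reference, the "two concurrent requests" test is basically just this (URL is illustrative for my local instance; in practice each request carries a different logged-in user's session cookie):
$ for i in 1 2; do curl -s -o /dev/null -w "%{http_code} %{time_total}s\n" "http://localhost:4747/home" & done; wait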
@djl can you try again on the latest :nightly ?
No dice, sadly, but this did give me motivation to start debugging again.
ServeHome still chokes when c.db.Stats() is called. I haven't been able to narrow down exactly which query is causing the problem yet.
ServeGetAlbumListTwo seems to be slow when the type parameter is newest. Ordering by albums.id instead of created_at makes the query near instant again. Ordering by a datetime column is apparently very slow?
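For what it's worth, a quick way to check whether SQLite can satisfy that ordering from an index (run against a copy of gonic.db); if the plan reports USE TEMP B-TREE FOR ORDER BY, there's no usable index on created_at:
$ sqlite3 gonic.db 'EXPLAIN QUERY PLAN SELECT * FROM albums ORDER BY created_at DESC LIMIT 10'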
@djl thanks, i pushed an index. :nightly is building
@sentriz the index does not seem to have helped. Timeouts still happen on the /rest/ API.
I still think there's an odd concurrency/locking issue somewhere. The queries, when run manually, are very fast:
$ time (sqlite3 gonic.db 'SELECT albums.*, count(tracks.id) child_count, sum(tracks.length) duration FROM "albums" LEFT JOIN tracks ON tracks.album_id=albums.id JOIN album_artists ON album_artists.album_id=albums.id GROUP BY albums.id ORDER BY created_at DESC LIMIT 20 OFFSET 0' > /dev/null)
real 0m0.135s
user 0m0.123s
sys 0m0.012s
Even when /rest/ calls are hanging, the manually run queries are still near-instant.
@djl was this an issue on v0.18.0? ie did it only become an issue since v0.19.0?
@sentriz You may be on to something. I've just pulled the v0.18.0 tag and have started a rescan. No issues so far, but I'm only about 20% of the way through my library. Will report back when it's done.
@sentriz It wasn't an issue with v0.19.0, but I've found the specific tag where the issues began: 8e7820a. The previous tag (3909e9b) is the last working one.
ahhhh thanks @djl, that pointed my attention to another issue: WAL mode was not being enabled for the Wasm driver. can you check :nightly again when it's built?
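to double check it took effect on your side, the journal mode of the db file itself can be read with the sqlite3 CLI; it prints wal once WAL is on (the mode is persisted in the file), otherwise delete:
$ sqlite3 gonic.db 'PRAGMA journal_mode;'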
Checked both the Docker image and a local build - still an issue :(
@djl oh no, and that was definitely commit b67c1c1?
@sentriz Definitely the right commit:
$ git rev-parse HEAD
b67c1c14d2b883f634f7aaecc0367a15ec7dab71
I think I've narrowed it down to a single commit: https://github.com/sentriz/gonic/commit/7802e20e7214e1912fc84bb29c877984b99913c9. This is the first commit where the issue appears. Is this an issue with removing the Cgo stuff?
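(For anyone doing the same kind of narrowing, git bisect automates the loop; a sketch using the good/bad tags mentioned above:)
$ git bisect start
$ git bisect bad 8e7820a     # first tag with the timeouts
$ git bisect good 3909e9b    # last working tag
$ go build -o gonic ./cmd/gonic/.   # build and test the checked-out revision...
$ git bisect good                   # ...then mark it good or bad; repeat until bisect names the culprit
$ git bisect reset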
the database changes are in 505f7d8 which comes right after 7802e20. could you test specifically between those two commits to confirm? thanks for all your help on this!
https://github.com/sentriz/gonic/commit/505f7d8ae6116c5dc1cfa50a5caba6333672be20 doesn't even build for me:
$ go build -o gonic ./cmd/gonic/.
go: go.senan.xyz/[email protected]: missing go.sum entry for go.mod file; to add it:
go mod download go.senan.xyz/wrtag
$ go mod tidy
$ go build -o gonic ./cmd/gonic/.
# go.senan.xyz/gonic/tags/wrtag
tags/wrtag/wrtag.go:33:14: undefined: wrtag.Properties
but that seems to be fixed in the next commit, which does build and still has the issue.
@djl i just pushed one more commit ~ca30762~: the WAL mode setting was not being shared between connections
EDIT: sorry, the new commit is 469649935e26ef279a4a805cd964d3b010aa8146
No luck with https://github.com/sentriz/gonic/commit/469649935e26ef279a4a805cd964d3b010aa8146 from nightly :(
ok, and are you sure 85ea8d8 does not have the issue? that's the earliest commit that makes sense to me as something that would actually break stuff
85ea8d896ab0d147fc8406a4173cec27b0693ce2 is definitely fine. 58c2933355f30301a716acb6e3160eb6306cd1a8 is where the problem starts.
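(For reference, the commits between those two can be listed with:)
$ git log --oneline 85ea8d896ab0d147fc8406a4173cec27b0693ce2..58c2933355f30301a716acb6e3160eb6306cd1a8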
I tried bypassing the reverse proxy, just in case it was an issue there. It is not.
$ curl "http://REDACTED:4747/rest/getAlbumList2?...
curl: (52) Empty reply from server
curl 0.00s user 0.00s system 0% cpu 15.465 total
@djl ok thank you. last option: maybe you could send me a copy of your database (with the passwords inside redacted)? then i can try to reproduce it