Elector interval field value out of range
I was evaluating riverqueue for use in one of my projects and after attempting to run it for the first time, I'm getting the following error:
time=2024-08-21T19:45:12.744Z level=ERROR msg="Elector: Error attempting to elect" client_id=5e589f722171_2024_08_21T19_45_11_805369 err="pq: interval field value out of range: "15000000000"" num_errors=2 sleep_duration=1.917471608s
Postgres logs:
2024-08-21 19:07:10.854 UTC [70] ERROR: interval field value out of range: "15000000000" 2024-08-21 19:07:10.854 UTC [70] CONTEXT: unnamed portal parameter $2 = '...' 2024-08-21 19:07:10.854 UTC [70] STATEMENT: -- name: LeaderAttemptElect :execrows INSERT INTO river_leader(leader_id, elected_at, expires_at) VALUES ($1, now(), now() + $2::interval) ON CONFLICT (name) DO NOTHING
I have validated that my migrations worked correctly.
Postgres v14.6 (I'm locked to this version at the moment) Go v1.23.0 River v0.11.4
Any suggestions?
Seems likely you’re using the dbsql driver? This appears to be a bug with that driver not handling duration types as db interval types correctly.
I tried both, actually. I thought it might have been the dbsql driver at first so I refactored to use riverpgxv5 and it was the same result.
Hmm, you must have at least gotten a different error when using pgx? The message above is specific to pq which isn’t used at all by the pgx driver.
I'll verify this tomorrow morning. It's entirely possible that I'm mixing things in the wrong way.
It occurred to me while looking at this that 15000000000 is the elector's default 15 second interval in nanoseconds, so 15 * 1_000_000_000.
You can see here that when pgx encodes a duration, it encodes it as microseconds, which is the maximum precision that Postgres will accept:
https://github.com/jackc/pgx/blob/4f7e19d67df4d411ac1c19373390021a2f23aa07/pgtype/builtin_wrappers.go#L498-L514
For some reason your pgx is trying to write nanoseconds instead of microseconds. Do you have a custom encoding override somewhere?
So it turns out I hadn't had enough coffee when I was testing this. I had set things up twice in different packages, and had forgotten to update the consumer's connection. Once I updated the consumer to correctly use the riverpgxv5 connector everything is working.
Thanks for your quick responses, RiverQueue is a very clean queue system to configure and use.
@brandur hmm, wondering if there still might be a dbsql-specific issue here given that the "fix" was to use the pgx driver 🤔
It may be that River doesn't function when using lib/pq. I'm going to wait and see if we get anymore bug reports about that before actioning though — it's going to be a yak shave, and lib/pq's been deprecated for so long that I suspect a lot of people have since dropped it.
Hi @brandur, I have the same error when using riverdatabasesql driver and unfortunately I am not in a position to move to pgx/v5 immediately.
pq: interval field value out of range
It should be a simple fix to sqlc.yaml code based on this comment - https://github.com/sqlc-dev/sqlc/issues/429#issuecomment-607930265
- db_type: "pg_catalog.interval"
go_type: "github.com/jackc/pgx/v5/pgtype.Interval"
Would you be willing to accept a PR for this?
@varunbpatil yes, PRs welcome. This is something we’ll also want to add test coverage for in the shared driver test suite to make sure it stays working.
I kinda suspect this is going to be a big yak shave, but yeah it'd be nice to get it working if possible.
@brandur you're right. I did get the driver tests to pass with the riverdatabasesql driver and lib/pq backend, but not without some sqlc datatype modifications. I suspect these modifications won't work with the riverdatabasesql + pgx/v5 backend, but I'm okay with that for my use-case.
Is it safe to assume that if the driver tests pass, most of the common functionality is working correctly?
Is it safe to assume that if the driver tests pass, most of the common functionality is working correctly?
Yep, pretty much. There are a couple exceptions that we know of like that pgx's simple mode isn't covered correctly, but it's okay because we don't yet support it.
I'm getting the same error using riverdatabasesql and a wrapped pgx/v5:
failed to encode args[1]: unable to encode 15000000000 into binary format for interval (OID 1186): cannot find encode plan
is this the same problem?
I've checked our wrapper (adds otel and prometheus support) and it does not appear that we do anything with Interval types. What's the best way to debug where this might be going wrong? Is the best thing to breakpoint at attemptElectOrReelect and step through from there?
Thanks for your time 🙏
EDIT: For anyone who needs to know, I fixed this by implementing the NamedValueChecker interface in our db wrapper and calling down to the wrapped driver's CheckNamedValue method. In our case this was https://github.com/XSAM/otelsql
Hello @brandur moving to pgx is a TODO for us, but currently stuck with database/sql so I wrote a workaround for the riverdatabasesql to cast the time.Duration to psql interval of seconds.
Should I open a PR for this?
Responded over on #882.
Opened: https://github.com/riverqueue/river/pull/883
Check out v0.22.0 which should fix this problem.
https://github.com/riverqueue/river/releases/tag/v0.22.0