odyssey
Current master (29.01.2022) breaks auth_query functionality compared with the 1.2 release
Hi there!
I tried to dockerize odyssey and found random segfaults under load (2k tps) on the 1.2 release. pgbench works great on the 1.2 release. To reproduce the problem on current master I made Docker files like these:
Dockerfile_centos7.txt
Dockerfile_ubuntu.txt
entrypoint.sh.txt
and got this error from pgbench:
Is the auth_query functionality gone, or are the configs not backward compatible?
Please add pool_routing "internal" to the auth route.
database "auth" {
user "auth" {
authentication "none"
storage "upstream"
storage_db "postgres"
storage_user "postgres"
storage_password "postgres"
pool_routing "internal"
pool "transaction"
pool_size 3
pool_timeout 0
pool_ttl 30
pool_discard no
pool_cancel no
pool_rollback no
}
}
Maybe we need to dynamically create an internal route for auth queries if the user has not configured one, or at least fix the docs about auth_query...
Thanks, @reshke. It works. But current master still segfaults. Are there any instructions for getting a core dump from my container?
P.S.
You can reproduce it with the configs above:
PGPASSWORD=secret pgbench -h odyssey-proxy -p 5432 -U nonpostgres -c 300 -j 100 -t 1000 pgbench
Of course, the auth checker user and nonpostgres are different users.
P.S. 2. The last line is current master from 29.01.2022; the others are release 1.2.
Thanks, your test case was helpful. There was a problem when the number of simultaneous auth queries is much bigger than the (auth route) pool size. This should be fixed by https://github.com/yandex/odyssey/pull/413, could you please test it as well? It looks like it should help.
@reshke Now it doesn't segfault, but with pool_size set on the auth query route it sometimes logs "failed to make auth_query" errors, as in the screenshot. When I disable the limit with pool_size = 0, it works well. That is much better than the original error, but it is still an issue with auth_query. Should I create a new issue?
To reproduce it:
- run a Docker container with current master, using Dockerfile_ubuntu and entrypoint.sh from the message above
- set a low pool_size so that the auth_query pool is much smaller than the number of pgbench connections
- run pgbench with a high connection count (-c)
- profit
When I disable it with pool_size = 0, it works well.
It seems to me that in this case everything works as expected, doesn't it? pool_size regulates the number of server connections to auth_db that can be taken simultaneously. With a large number of parallel authentications (pool_size << number of clients trying to authenticate), there will be errors anyway, I think.
@reshke Nope, the feature is described differently:
https://github.com/yandex/odyssey/blob/master/documentation/configuration.md#pool_size-integer
Clients are put in a wait queue, when all servers are busy.
and:
pool_timeout integer
Server pool wait timeout.
Time to wait in milliseconds for an available server. Disconnect client on timeout reach.
Set to zero to disable.
I set pool_timeout = 0 and expect everything to work as described :)
Btw, this feature works as expected for my main storage, but not for auth.
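For reference, the semantics quoted above can be illustrated with a toy model. This is only a sketch of the documented behaviour, not Odyssey's implementation: `ToyPool` and `simulate` are hypothetical names, a semaphore stands in for the server pool, and a sleep stands in for the auth query.

```python
import threading
import time


class ToyPool:
    """Toy model of the documented pool_size / pool_timeout semantics."""

    def __init__(self, pool_size, pool_timeout_ms):
        # Assumption from the quoted docs: pool_size 0 disables the limit.
        self._slots = threading.Semaphore(pool_size) if pool_size > 0 else None
        # Assumption from the quoted docs: pool_timeout 0 disables the timeout,
        # so a client waits in the queue until a server is free.
        self._timeout = pool_timeout_ms / 1000 if pool_timeout_ms > 0 else None

    def run(self, work_seconds):
        """Return True if the client got a server, False if it timed out."""
        if self._slots is None:
            time.sleep(work_seconds)
            return True
        if not self._slots.acquire(timeout=self._timeout):
            return False  # models "Disconnect client on timeout reach"
        try:
            time.sleep(work_seconds)  # stands in for the auth query itself
            return True
        finally:
            self._slots.release()


def simulate(pool_size, pool_timeout_ms, clients, work_seconds=0.01):
    """Start `clients` concurrent clients; return (succeeded, timed_out)."""
    pool = ToyPool(pool_size, pool_timeout_ms)
    results = []
    lock = threading.Lock()

    def client():
        ok = pool.run(work_seconds)
        with lock:
            results.append(ok)

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.count(True), results.count(False)
```

In this model, `simulate(3, 0, 300)` queues the extra clients and every one of them is eventually served, which is what the documentation describes for pool_timeout 0. Failures appear only when a non-zero pool_timeout is shorter than the time a client has to wait.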
It seems I have the same problem with the 1.2 -> 1.3 transition:
ERROR: odyssey: c9af96b00a8d9: failed to make auth query (SQLSTATE 28000)
database default {
user default {
authentication "md5"
auth_query "SELECT username, password FROM odyssey.get_auth($1)"
auth_query_db "odyssey"
auth_query_user "odyssey"
password "md5xxx"
storage "postgresql"
pool "session"
pool_size 0
pool_timeout 0
pool_ttl 60
pool_cancel no
pool_rollback yes
client_fwd_error no
}
}
I have the same issue after migrating from 1.1 -> 1.3:
# ADMIN CONSOLE
storage "local" {
type "local"
}
database "console" {
user default {
authentication "none"
pool "session"
storage "local"
}
}
# DEFAULT CONNECTION
storage "postgres_server" {
type "remote"
host "xxx.xxx.xxx.xxx"
port 5432
tls "disable"
}
database default {
user default {
authentication "md5"
auth_query "SELECT usename, passwd FROM pg_shadow WHERE usename='%u'"
auth_query_db "postgres"
auth_query_user "postgres"
storage "postgres_server"
pool "session"
client_fwd_error yes
client_max 400
}
}
unix_socket_dir "/tmp"
unix_socket_mode "0644"
log_format "%p %t %l [%i %s] (%c) %m\n"
log_to_stdout yes
log_debug no
log_config yes
log_session yes
log_query no
log_stats yes
stats_interval 60
listen {
host "*"
port 6432
tls "disable"
}
I have the same issue after migrating from 1.1 -> 1.3:
Same for me, "auth_query" doesn't work at all in 1.3
Same error
Defining database "postgres" with internal routing works for me:
storage "testdb" {
type "remote"
host "testdb"
port 5432
}
database "postgres" {
user "pgbouncer" {
authentication "none"
pool_routing "internal"
storage "testdb"
pool "session"
}
}
database "test" {
user default {
authentication "md5"
auth_query "SELECT uname,phash FROM pgbouncer.user_lookup($1)"
auth_query_user "pgbouncer"
auth_query_db "postgres"
storage "testdb"
...
}
}
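The workaround reported in this thread (an explicit route for the auth_query_db / auth_query_user pair with pool_routing "internal") can be checked mechanically before an upgrade. Below is a minimal Python sketch, assuming a simplified config layout with one statement per line. `parse_routes` and `missing_internal_routes` are hypothetical helpers, not part of Odyssey, and the parser is deliberately naive. It only looks for an exactly matching database/user route; whether Odyssey can also resolve the auth route through a default route is not covered here.

```python
import re

# Matches block openers such as: database "test" {   user default {   listen {
BLOCK_RE = re.compile(r'^(\w+)(?:\s+"?([^"\s{]+)"?)?\s*\{$')
# Matches simple options such as: pool_routing "internal"   pool_size 3
OPTION_RE = re.compile(r'^(\w+)\s+"?(.*?)"?$')


def parse_routes(config_text):
    """Return {(database, user): {option: value}} for every user block.

    Naive by design: one statement per line, '#' starts a comment,
    single-line blocks (e.g. storage "x" { ... }) are ignored.
    """
    routes = {}
    stack = []  # currently open blocks, e.g. [("database", "test"), ("user", "default")]
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.endswith("{"):
            m = BLOCK_RE.match(line)
            stack.append((m.group(1), m.group(2)) if m else ("?", None))
            continue
        if line == "}":
            if stack:
                stack.pop()
            continue
        in_user_block = (
            len(stack) == 2 and stack[0][0] == "database" and stack[1][0] == "user"
        )
        m = OPTION_RE.match(line)
        if in_user_block and m:
            key = (stack[0][1], stack[1][1])
            routes.setdefault(key, {})[m.group(1)] = m.group(2)
    return routes


def missing_internal_routes(config_text):
    """List (route, auth_target) pairs whose auth target has no internal route."""
    routes = parse_routes(config_text)
    problems = []
    for route, opts in routes.items():
        if "auth_query" not in opts:
            continue
        target = (opts.get("auth_query_db"), opts.get("auth_query_user"))
        target_opts = routes.get(target)
        if target_opts is None or target_opts.get("pool_routing") != "internal":
            problems.append((route, target))
    return problems
```

Calling `missing_internal_routes(open("odyssey.conf").read())` on a config shaped like the working one above should return an empty list (the file name is only an example). A non-empty result names each route that uses auth_query together with the (auth_query_db, auth_query_user) pair that still needs a route with pool_routing "internal".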