Shutdown (without persist) hangs in Github actions
I can't repro this locally, so I don't know what the source of the issue is. The temporary DB, created with PgTempDB::async_new().await (so persist isn't set), hangs for minutes after a call to shutdown. Maybe there are some existing connections to the DB, but according to the documentation, this shouldn't block shutdown when persist isn't set.
This happens on a runner running Ubuntu noble, with postgresql-16 installed.
Could you post a code sample? I briefly looked at the docs and it says "SIGKILL kills the postgres process without letting it relay the signal to its subprocesses, so it might be necessary to kill the individual subprocesses by hand as well." which I didn't know - maybe in my testing I only ever tried to kill it before doing work and the subprocesses were never spawned.
This sample completes ok locally, but in my Github Actions it hangs indefinitely after printing "closing temp DB"
#[tokio::test]
async fn pgtemp_test() {
use pgtemp::PgTempDB;
use sqlx::postgres::PgPoolOptions;
let temp_db = PgTempDB::async_new().await;
println!("got temp DB");
let address = format!(
"postgresql://{}:{}@localhost:{}",
temp_db.db_user(),
temp_db.db_pass(),
temp_db.db_port()
);
let pool = PgPoolOptions::new()
.connect(&address)
.await
.expect("Can't connect to postgres");
let conn = pool.acquire().await.unwrap();
drop(conn);
println!("starting teardown");
pool.close().await;
println!("pool2: size: {} idle: {}", pool.size(), pool.num_idle());
drop(pool);
println!("closing temp DB");
drop(temp_db);
println!("teardown complete");
}
Per #10 the issue is I was dumb with how I designed the shutdown function - probably what happened was that I wrote shutdown as fn shutdown(mut self) but then I wanted to use it inside drop, which is implemented with fn drop(&mut self) so I thought "oh I'll just change the signature of shutdown". However, it's obvious in hindsight that shutdown will then be called twice.
There are a lot of options here to fix this and pretty much all of them are breaking changes (which tbh isn't that big a deal since it's not like I have a ton of users). Probably simplest solution (and also doesn't change the API) is the one you provided but give me a bit to think about it.
No hurry, but the PR is a separate issue - I see in github that shutdown never completes on the first call.
On Thu, 11 Jul 2024, 0:20 Harry Stern, @.***> wrote:
Per #10 https://github.com/boustrophedon/pgtemp/pull/10 the issue is I was dumb with how I designed the shutdown function - probably what happened was that I wrote shutdown as fn shutdown(mut self) but then I wanted to use it inside drop, which is implemented with fn drop(&mut self) so I thought "oh I'll just change the signature of shutdown". However, it's obvious in hindsight that shutdown will then be called twice.
There are a lot of options here to fix this and pretty much all of them are breaking changes (which tbh isn't that big a deal since it's not like I have a ton of users). Probably simplest solution (and also doesn't change the API) is the one you provided but give me a bit to think about it.
— Reply to this email directly, view it on GitHub https://github.com/boustrophedon/pgtemp/issues/9#issuecomment-2221503424, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDEBBRYVD3XG7ABXU4BBWLZLWQSJAVCNFSM6AAAAABKSVYIEOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMRRGUYDGNBSGQ . You are receiving this because you authored the thread.Message ID: @.***>
So the PR doesn't solve the hanging issue?
nope, it solves a different issue.
Taking another look at this I wonder if the older version of pg in CI is different than what we have locally and maybe the behavior on sigkill is different, and a newer pg does better cleanup?
I can't get this to race locally with 1 or 100 connections.
I don't think it's a version mismatch - checked this with CI & local versions both 16.3. It could be an OS difference - CI is Ubuntu, local is MacOS.
I'm also on linux with 16.3. If you run the test in release mode does it hang in CI still?
Yup.
On Sun, 18 Aug 2024, 2:17 Harry Stern, @.***> wrote:
I'm also on linux with 16.3. If you run the test in release mode does it hang in CI still?
— Reply to this email directly, view it on GitHub https://github.com/boustrophedon/pgtemp/issues/9#issuecomment-2295023315, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDEBBTDS5JIQ4MVSKBW5CDZR7KZ7AVCNFSM6AAAAABKSVYIEOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEOJVGAZDGMZRGU . You are receiving this because you authored the thread.Message ID: @.***>
The newest release, v0.5.0, uses fast shutdown instead of sending SIGKILL. Could you check and see if this issue still occurs with the newest version?
Did you get this to work?
No, I tried it once, the issue seemed to have persisted, but I didn't have the time to investigate. Since then I've moved to another team, so I probably won't have another opportunity to test this, so @boustrophedon I guess the issue can be closed.
Thanks @nihohit for the quick reply. I'm receiving the same error as you originally posted, so I don't think the issue should be closed.
@boustrophedon Have you been able to get it working in Github actions?
Update: Issue persists if I change to PgTempDB::new() instead of PgTempDB::async_new().
@jacobsvante This is still happening on pgtemp 0.5?
Yes, it appears so.
Do you have any integration tests?
Yes, they're in the tests/ directory. I just added the sample test from this issue so we'll see what happens: https://github.com/boustrophedon/pgtemp/actions/runs/13167858114
Running tests/shutdown.rs (target/debug/deps/shutdown-9f5c6b24a836d5ff)running 1 test test pgtemp_shutdown_test ... ok
The sample provided passed in CI (and then the build failed due to a linting error in the python client tests)
The issue was that I had not read the requirements section of the readme, sorry a bit much on my plate lately...
I had forgot to add these lines:
- name: Install postgres
run: sudo apt-get install postgresql postgresql-client
- name: Update path
run: find /usr/lib/postgresql/ -type d -name "bin" >> $GITHUB_PATH
I'll see if I can find some time to add a check for this when running pgtest. Would make it less likely for stressed people like me being confused about why the tests are running for a long time and then just fail with a vague error message 😅
What is causing the issue btw?
I don't know what would have caused your tests to hang - If I remove the postgres install and PATH lines, the tests simply error out. https://github.com/boustrophedon/pgtemp/actions/runs/13188950120/job/36817698923
Strange. Maybe it was two tests running simultaneously. I'll see if I can find what was causing it.