
Make database tests run in parallel and not fail because of other failed tests

Open dullbananas opened this issue 1 year ago • 23 comments

  • Create a new build_db_pool_for_tests function which:
    • Builds the pool in a LazyCell which runs migrations and maybe inserts stuff in the initialization function
    • Gets a pooled connection, calls the AsyncConnection::begin_test_transaction method, and returns it by wrapping it in DbPool::Conn
  • Remove the serial attribute from tests, since concurrent transactions don't see each other's effects

dullbananas avatar Aug 15 '24 22:08 dullbananas

Could be faster, but also probably not worth the effort, as the tests don't take that long. Federation tests are much slower anyway.

dessalines avatar Sep 10 '24 17:09 dessalines

Test failures causing subsequent tests to fail is a big enough inconvenience, in my opinion.

dullbananas avatar Sep 11 '24 04:09 dullbananas

Here are two blog posts related to Postgres testing. The first one also mentions some problems that happen when using migrations for tests.

https://blog.dogac.dev/pg-test-table-track/

https://calpaterson.com/against-database-teardown.html

And some discussion about these posts: https://lobste.rs/s/84ysx5/start_with_clean_slate_integration

Also this tool: https://github.com/boustrophedon/pgtemp

Nutomic avatar Apr 23 '25 13:04 Nutomic

A full CI run on the main branch currently takes 49 minutes. Of that, 21 min are unit tests, 11 min are clippy, and the federation tests are only 7 min.

I actually wrote something like this to run tests in parallel for Ibis. It uses an AtomicI32 to allow a limited number of parallel tests, and it generates a new temporary db for each test case. This code can be adapted for Lemmy, with multiple temp dbs which get reused between tests.

https://github.com/Nutomic/ibis/blob/master/crates/backend/tests/common.rs

Nutomic avatar Jun 05 '25 09:06 Nutomic

pgtemp looks like the best option to me, along with removing serial for as many of these as possible.

It'd still need to run the migrations for each test (or within each file), but at least everything could run in parallel, especially with `cargo test -- --test-threads=X` (note the `--` separator: `--test-threads` is an argument to the test binary, not to cargo itself).

dessalines avatar Jun 05 '25 16:06 dessalines

Running all migrations again for each test is not what's currently done, and it would probably be way too slow. I think the begin_test_transaction solution is best. Per-test databases like in Ibis can be used for the very small number of tests that need it (the only one I know of is the migrations test).

dullbananas avatar Jun 05 '25 22:06 dullbananas

Yes it needs some adjustments but you can see the general idea. For Lemmy it should work like this:

  • Create single cluster with pg_ctl init
  • Create multiple databases with CREATE DATABASE name; and run migrations
  • Put these into deadpool
  • In test-context setup method, retrieve a database from the pool
  • Now run the test (having exclusive access to the current db)
  • In teardown method release db back to pool

Nutomic avatar Jun 06 '25 10:06 Nutomic

If you can get it working with transactions, then that'd be best. Then you don't have to deal with multiple DBs and teardowns.

dessalines avatar Jun 07 '25 00:06 dessalines

The post I linked above describes why transactions don't work well for tests. Having multiple DBs is not complicated, and we are already doing teardowns now.

Nutomic avatar Jun 10 '25 10:06 Nutomic

Will look at the issue this week

momentary-lapse avatar Jun 16 '25 00:06 momentary-lapse

Found a crate which basically implements an approach mentioned by @Nutomic, and here's an example of usage with diesel. Do you see any pitfalls of using it in the Lemmy code?

momentary-lapse avatar Jun 20 '25 00:06 momentary-lapse

Nice, looks legit. It's worth a try.

There's also https://github.com/boustrophedon/pgtemp if that doesn't work.

dessalines avatar Jun 20 '25 00:06 dessalines

We should try the db-pool crate.

For maintainability, the below code that each test currently has should not be replaced with direct use of the external crate. Instead, there should be a new wrapper in lemmy_db_schema::utils.

let pool = &build_db_pool_for_tests();
let pool = &mut pool.into();

If tests remain too slow, then we can make most tests share one of the pooled databases and use AsyncConnection::begin_test_transaction.

dullbananas avatar Jun 21 '25 00:06 dullbananas

Yeah, I see. The build_db_pool_for_tests function is currently a dummy that uses a regular pool, but it should build its own. I'm currently looking at how we can adopt the crate. Unfortunately, it supports only the bb8 pool; there's code for a deadpool feature as well, but it's not listed among the supported pools and is commented out in mod.rs. I cloned the repo and would like to play with the code a bit; maybe it's not so hard to adapt. I also asked the owner about that and hope they answer.

momentary-lapse avatar Jun 21 '25 01:06 momentary-lapse

UPD. I contacted the owner of the repo, and he said he'll be able to finish the deadpool support this weekend (I offered him assistance; let's see how it goes). I think that would be perfect, because I really like how easy it is to set up the pooling with that crate.

momentary-lapse avatar Jun 21 '25 12:06 momentary-lapse

db-pool now supports deadpool for postgres, but when I try using it, I get:

error: failed to download `db-pool v0.5.0`

Caused by:
  unable to get packages from source

Caused by:
  failed to parse manifest at `/home/aboitsov/.cargo/registry/src/index.crates.io-6f17d22bba15001f/db-pool-0.5.0/Cargo.toml`

Caused by:
  feature `edition2024` is required

  The package requires the Cargo feature called `edition2024`, but that feature is not stabilized in this version of Cargo (1.81.0 (2dbb1af80 2024-08-20)).
  Consider trying a newer version of Cargo (this may require the nightly release).
  See https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#edition-2024 for more information about the status of this feature.

Are we currently locked to version 1.81.0 of Rust/Cargo?

momentary-lapse avatar Jun 28 '25 16:06 momentary-lapse

Dang. Unfortunately yes, due to https://github.com/rust-lang/rust/issues/132064 , https://github.com/rust-lang/rust/issues/141006 .

I'm sure those issues will get fixed eventually, and we won't be stuck on 1.81 forever. I'd say for now go ahead and update the edition and continue with the PR; we can decide later whether to leave it as a pending PR until Rust fixes the issues with diesel, or just merge, be on the latest Rust, and deal with the slow compile times.

dessalines avatar Jun 28 '25 16:06 dessalines

Keep in mind that some tests currently call build_db_pool instead of build_db_pool_for_tests, so those need to be changed too.

dullbananas avatar Jun 29 '25 02:06 dullbananas

You can also ask the db-pool maintainer to lower the minimum Rust version. With stable Rust even incremental builds will be extremely slow (5 minutes or more) which makes development very annoying, and erases all the speedup from parallelizing.

Nutomic avatar Jun 30 '25 07:06 Nutomic

Yeah, good point. I'll ask, or prepare a PR for that. By the way, I stumbled upon another problem integrating the library. Here you have to define a closure preparing the db: https://github.com/yasamoka/db-pool/blob/main/examples%2Fdiesel_async_postgres_deadpool.rs#L37-L48. In our case, we should run migrations. But the problem is that our migrations are sync, while this code requires an async connection to be passed and returned. Diesel's async wrapper is also not a solution, since it takes ownership of the connection and you must return it. I'm thinking about making another sync connection right inside the closure and doing the migrations there, but I don't yet understand whether it's possible to get the db URL from an existing AsyncPgConnection. I asked the diesel devs, but maybe you know?

momentary-lapse avatar Jun 30 '25 08:06 momentary-lapse

Best open a PR with what you have so far, then we can have a look directly.

Nutomic avatar Jul 01 '25 07:07 Nutomic

I explained the issue to the db-pool owner and he's taking a look at his initialization-function API, because providing an async connection doesn't seem to work, at least in diesel's case. I might do it in a fork until then, because right now it's impossible to adapt the existing code. Also, we're going to need a fork anyway because of 1.81.

Here's my dialogue with him:

https://github.com/yasamoka/db-pool/discussions/3

momentary-lapse avatar Jul 01 '25 08:07 momentary-lapse

I made this draft to share some thoughts: #5846. Please take a look when you have some spare time.

momentary-lapse avatar Jul 04 '25 15:07 momentary-lapse