sea-orm-cli generate entity PoolTimedOut error with large schema
First off, thanks for all the hard work!
Description
When trying to use sea-orm-cli generate entity on a large Postgres 9.6.22 database (>400 tables of varying shapes, sizes, constraints, etc.), I encounter the following error:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: PoolTimedOut', /Users/will.wang/dev-projects/sea-schema/src/postgres/discovery/executor/real.rs:30:55
Similar to: https://github.com/SeaQL/sea-orm/issues/555
Steps to Reproduce
Unfortunately, I can't provide exact details around the schema. I think any sufficiently large database should work. I believe this occurs because the connection times out in between executing the many introspection queries.
Expected Behavior
Entities are automatically generated
Actual Behavior
Connection fails, panic occurs
Reproduces How Often
Always
Versions
sea-orm-cli: 0.8.1 (built locally on master at https://github.com/SeaQL/sea-schema/commit/00429f3bdbfe56af1afcee94b069a9619cea4263)
postgres: 9.6.22
os: MacOS Monterey 12.3.1
Additional Information
I think the default connect_timeout of 30 seconds for the underlying sqlx::PoolOptions is too short. When I changed this locally to 120 seconds, I was able to avoid this issue. It would be great if we could configure a connect_timeout as an argument to sea-orm-cli generate entity, similar to max_connections.
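For context, here is a minimal sketch of what raising that timeout looks like on the sqlx side. It assumes the sqlx 0.5/0.6-era API (newer sqlx versions rename connect_timeout to acquire_timeout), and the connect helper below is illustrative rather than sea-orm-cli's actual code:

use std::time::Duration;
use sqlx::postgres::PgPoolOptions;

// Illustrative pool setup with a longer timeout than the 30-second default;
// the 120-second value mirrors the local change described above.
async fn connect(url: &str) -> Result<sqlx::PgPool, sqlx::Error> {
    PgPoolOptions::new()
        .max_connections(1)                         // the CLI defaults to a single connection
        .connect_timeout(Duration::from_secs(120))  // default is 30 seconds
        .connect(url)
        .await
}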
Hey @001wwang, thanks for the investigation! Increasing the connect_timeout seems like a good temporary fix. However, I'd like to know why it takes so long to fetch the info for all the db tables. Does anyone have a large sample schema for testing?
No problem, thanks for the prompt reply! I'll see if I can produce a large sample schema for recreating this issue later.
How did you solve it? I also have this problem now
Sorry, this issue slipped my mind. Perhaps you could provide more details about your schema to help @billy1624 reproduce this locally.
For my fix, I modified the sea-orm-cli codebase locally to use a timeout of 120 seconds, then built and ran it locally. Disclaimer: it's been a few months since then, so I can't guarantee this still works or that it's the same problem you're facing.
I encountered this error for tables with partitions (Postgres). If a table has a lot of partitions, the generator treats them as separate tables and tries to generate entities for all of them. Deleting those partitions solved the problem for me.
This is actually expected, and JOOQ has the same behavior. In JOOQ you can solve it by providing an exclude regex like
<excludes>.*_p[0-9_]+</excludes>
so that those tables are skipped.
In an ideal scenario you should not have partitions in your target/seed schema, but a regex option could be added to the CLI to help with these kinds of situations in general. In other cases, a connect_timeout option would be useful as well.
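For illustration only, a filter using that exclude pattern could look like the sketch below; the regex crate usage and the keep_table helper are hypothetical, not an existing sea-orm-cli option:

use regex::Regex;

// Hypothetical table filter mirroring the JOOQ-style exclude regex above.
fn keep_table(name: &str, exclude: &Regex) -> bool {
    !exclude.is_match(name)
}

fn main() {
    let exclude = Regex::new(r".*_p[0-9_]+").unwrap();
    let tables = ["orders", "orders_p2023_01", "users"];
    let kept: Vec<&str> = tables
        .iter()
        .copied()
        .filter(|t| keep_table(t, &exclude))
        .collect();
    assert_eq!(kept, ["orders", "users"]); // partition-like names are skipped
}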
Hey everyone, I created 3,000 tables on my local Postgres but still not able to reproduce the error. Could someone guide me how to reproduce it?
CREATE TABLE IF NOT EXISTS t_793593555976192 (col1 integer, col2 integer, col3 integer);
... 3,000 create table statements
I'll go ahead and close this issue given the above comments. I haven't looked into this any time recently, and it seems like it might be working at this point. Thanks all!
Just for anyone else facing the issue: this seems to happen intermittently on slow, weak, or high-packet-loss networks. Ensure your connection to the database is low latency to make it work.
I am having the same issue with partitioned tables as well. +1 for regex exclusion.
I tried writing a script to pull them one table at a time, and it still hits the same issue. This is against a local Docker container.
# List all base tables in the public schema
data=$(psql "${DATABASE_URL}" -AXqtc "SELECT table_name FROM information_schema.tables WHERE table_schema='public' AND table_type='BASE TABLE';")
# Drop partition tables and the migrations table from the list
data=$(echo "$data" | rg -v -e "p_hash_p" -e "schema_migrations")
# Generate entities one table at a time
for TABLE in $data
do
  echo "$TABLE"
  sea-orm-cli generate entity -o entity/src --lib --tables "$TABLE"
done
Edit:
I realize the issue here is not which tables I'm trying to gather, but rather that the schema discovery query itself takes a long time. Would it be possible to take the approved table list (given to the CLI above) and use it to filter the discovery step?
Ah, I think I tracked it down.
In sea-schema, we are doing future::join_all() on a potentially very large list. The futures all end up rushing to the executor to get a connection to execute their query, but the default CLI config only has a single connection, so they're all waiting for that one connection to become available and the pool times out for some of the later ones. The fix was to set --max-connections 50 to allow higher throughput and avoid the timeout. A lower number probably would've worked too, but that's what did it for me.
@jared-mackey thank you for pointing it out. It seems like we should not join_all more than the pool size in SeaSchema.
I imagine we can chunk the vector by the pool size.
for chunk in vec.chunks(pool_size) {
    // `op` builds at most `pool_size` introspection futures from the chunk
    join_all(op(chunk)).await?;
}
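A more complete, self-contained sketch of the chunking idea (fetch_table stands in for the real per-table introspection query; the names and error type are illustrative, not sea-schema's actual API):

use futures::future::join_all;

// Pretend per-table query; the real one would go through the pool's executor.
async fn fetch_table(name: &str) -> Result<String, std::io::Error> {
    Ok(format!("table info for {name}"))
}

// Run at most `pool_size` queries concurrently so none of them starve
// waiting for a free connection and hit PoolTimedOut.
async fn discover(tables: &[&str], pool_size: usize) -> Result<Vec<String>, std::io::Error> {
    let mut results = Vec::with_capacity(tables.len());
    for chunk in tables.chunks(pool_size) {
        let batch = join_all(chunk.iter().map(|t| fetch_table(t))).await;
        for item in batch {
            results.push(item?);
        }
    }
    Ok(results)
}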
@tyt2y3 Seems like a good approach.
A couple of thoughts on the issue after looking into it a bit.
There is the join_all problem that should be solved. However, it seems to be exacerbated by the following:
- Tables set to ignore via the CLI argument are still pulled in by the discovery process; they're just not generated in the output. There is probably a good reason for this, but I thought I'd mention it.
- And to tie it back to #1582 a bit: in my case all the partitions had the same schema, and I don't care about anything other than the parent table. But I do believe there are cases where partitions can differ from the parent schema, so maybe we cannot optimize those out. It would take some research to see whether there is a flag or something that tells us when a partition matches the parent; if it does, I don't believe we should pull it at all (see the catalog-query sketch below).
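A hedged sketch of one way to skip partition children during discovery, assuming Postgres 10+ declarative partitioning; the query and constant name below are illustrative and not part of sea-schema (and inheritance-based partitions on 9.6 would not be detected this way):

// Illustrative catalog query: list tables in `public` while excluding
// partition children via the `relispartition` flag (Postgres 10+).
const NON_PARTITION_TABLES: &str = r#"
    SELECT c.relname
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE n.nspname = 'public'
      AND c.relkind IN ('r', 'p')  -- ordinary and partitioned (parent) tables
      AND NOT c.relispartition     -- skip the individual partitions
"#;

// e.g. sqlx::query(NON_PARTITION_TABLES).fetch_all(&pool).await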