npgsql icon indicating copy to clipboard operation
npgsql copied to clipboard

Generate all hard-coded type OIDs for NoTypeLoading

Open Shawson opened this issue 2 years ago • 5 comments

We're having issues with an ETL application we've written which establishes lots of connections to lots of different Postgres databases. During this process, the constant closing and reopening of connections to fresh databases seems to be causing issues with the initial type loading slowing down the connection process. As a result we have added Server Compatibility Mode=NoTypeLoading; to our connection string which speeds up connections. However we then want to use integer[] type in our queries.

Is there a way to manually register this type, allowing us to still ue int[] in Postgres without having to enable type loading? We've looked through the npgsql repo and can see lots of classes that should enable this, but most are internal or have protected constructors and it's not obvious which builder or factory classes we should be using to build them.

Npgsql version: 5.0.7 PostgreSQL version: PostgreSQL 12.1, compiled by Visual C++ build 1914, 64-bit Operating system: Linux

Shawson avatar Jul 28 '21 11:07 Shawson

Unfortunately, I don't think it's possible to manually map an array type using public-facing user APIs...

Just to make sure we're on the same page, type loading occurs only the first time you connect to a database (i.e. host/port/database connection string combination) - are you really connecting to that many different databases? Also, type loading is typically quite fast, unless there's a large amount of types defined in the database (or the perf requirements are extremely high). Are you sure this is your case?

Note that we should be able to have arrays in NoTypeLoading mode, since their type OIDs are also hardcoded in PostgreSQL - @Brar was looking into doing this. I'll keep this issue open for tracking that.

roji avatar Jul 29 '21 08:07 roji

Apparently we hit some similar issues today around this with the opposite extreme from the OP. Type loading noticeably affected performance for us on a single highly sharded db with 100k schemas with the type loading query apparently taking between 1-2 seconds. This was exacerbated by it being called from a Function-as-a-Service context where processes are spun up (and type loading performed again) as load increased. I'm told the current behavior of NoTypeLoading didn't work for us due to needing to parameterize arrays in queries.

Since this is in the backlog, would you be willing to accept a PR from myself or a community member if this could be improved in some way?

stevenaw avatar Apr 15 '22 01:04 stevenaw

As multiple people are hitting this, we should look into supporting all the built-in PostgreSQL types with NoTypeLoading; this would mean all types where the type OID is hard-coded in the PostgreSQL source (see this). We could manually maintain PostgresTypeOIDs.cs to mirror pg_type.dat, or we could use a source generator to generate it (possibly even fetching pg_type.dat as part of our CI). Once we do this, the only types which wouldn't be supported with NoTypeLoading would be types created by extensions (e.g. PostGIS) and user-created types (e.g. enums, composites).

@Brar at some point you expressed interest in doing this, are you still interested?

roji avatar Apr 15 '22 09:04 roji

@Brar at some point you expressed interest in doing this, are you still interested?

Yes I am. In fact I have some promising (but currently abandoned) attempt for a source generator somewhere that (like PostgreSQL's genbki.pl) attempts to parse the *.dat (IIRC there were relevant parts beyond pg_type.dat) files in postgres/src/include/catalog and from that should generate source code for Npgsql. Since It has turned out that I have too much on my plate right now to do all the things I'm interested in I'm happy to get out of the way if this gets urgent. I didn't even find the code I have already written but I may possibly find it if I invest a bit more time.

Brar avatar Apr 15 '22 11:04 Brar

Sure - I don't think there's any urgency here, I don't think we should do this for a 6.0 patch release in any case. I've assigned this to you for 7.0, we can revisit later if you have no time for it.

roji avatar Apr 15 '22 11:04 roji

Design decision: the value of generating the type OIDs from PG does not justify the complexity/fragility here. We're also assuming that PG will never change any existing OIDs, only add new ones. And at that point, for the OID to be useful we need to write a type handler for it.

roji avatar Sep 11 '22 15:09 roji

We've recently started the process of moving to npgsql@7 and this has proven to be a very helpful feature. Thanks once again

stevenaw avatar Aug 07 '23 16:08 stevenaw

Great to hear, thanks for letting us know @stevenaw!

roji avatar Aug 07 '23 17:08 roji