pgsync icon indicating copy to clipboard operation
pgsync copied to clipboard

Bootstrap very slow when there is an irrelevant schema with many tables

Open d2a-raudenaerde opened this issue 4 years ago • 2 comments

PGSync version: 2.1.1

Postgres version: 12.5

Python version: 3.8

Problem Description:

I try running bootstrap on a database with 3 schema's. One schema has >1000 tables. I want to run on the other custom schema; but the code inspects ALL tables in ALL schemas, which is very slow, and I think, unnecessary. Modifying the code to use only the specific schema, it speeds up dramatically.

sync.py line 168 and 205 seem relevant.

d2a-raudenaerde avatar Aug 04 '21 12:08 d2a-raudenaerde

pgsync is supposed to work across database schemas. So this is supporting a functionality. That said there is probably a more efficient way of skipping non-relevant tables.

toluaina avatar Aug 09 '21 19:08 toluaina

I am running into a similar problem, as I have a database with many schemas but only one schema relevant to pgsync and accessible by a service account. When pgsync tries to create the table_notify() function, it errors with InsufficientPrivilege.

A solution here would be to have an optional top level field schemas that lets us specify which schemas we'll interact with. If missing, it defaults to the current behavior: self.__schemas = sa.inspect(self.engine).get_schema_names()

loren avatar Sep 15 '22 21:09 loren