`COPY`ing an increasing amount of data from the Postgres metadata store as the DB grows
I have a large number (~250) of writers connecting to my Postgres+S3 DuckLake instance and inserting rows.
Looking at my RDS Postgres metrics, I see an increasing amount of data being scanned:
The statement stats show that most of it comes from a `COPY` statement:
Is it necessary to copy this many rows out to attach and insert data? Thanks!
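For anyone wanting to reproduce the per-statement breakdown above, it can be pulled from `pg_stat_statements` (a sketch assuming that extension is enabled on the RDS instance; the view and columns are standard Postgres, nothing DuckLake-specific):

```sql
-- Top statements by rows processed; the metadata COPY should
-- dominate here if it is responsible for the scan volume.
SELECT query, calls, rows, shared_blks_read
FROM pg_stat_statements
ORDER BY rows DESC
LIMIT 10;
```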
If we create indexes on those metadata tables, the scan I/O should improve significantly.
Thanks for the report!
This is caused by the extension not yet sending the queries to Postgres for execution; instead, it fetches the table contents and runs the queries in DuckDB (see my comment here). This is not required, and we plan to add support for running these queries directly in Postgres in the near future.
I see, thanks!