pglite icon indicating copy to clipboard operation
pglite copied to clipboard

Initial Bulk Loading Big Database With Indexing

Open bishoyroufael opened this issue 4 months ago • 5 comments

Hi!

I am relying on the library to load some data in a PWA app which bulk loads large amount of data using the COPY command (with .csv files). I have a specific table i.e big_table which has roughly 120k rows and I have a pg_trgm index on it for fuzzy matching on the client side.

CREATE INDEX IF NOT EXISTS trgm_idx_slide_big_table
ON big_table
USING GIST ((
  coalesce(remove_diacritics(content), '') || ' ' ||
  coalesce(remove_diacritics(header), '')) gist_trgm_ops );

The current problem is that indexing alone takes around the same amount of time of bulk importing the data. The data import takes around ~10s and the Indexing takes around ~14s.

The whole database is around ~130 MB. Is there any other efficient way that I could use for faster insertion initially?

bishoyroufael avatar Aug 06 '25 18:08 bishoyroufael

Hey @bishoyroufael could you please benchmark it against a PostgreSQL installation on the same/similar machine?

tdrz avatar Aug 07 '25 05:08 tdrz

PostgreSQL locally works much quicker than using pglite in the browser. I am actually using pglite by loading it dynamically in my react-native app (i.e since there's no official support for it now in react-native) using:

  const fetchedWasm = await fetch(wasmAsset.localUri!);
  const wasmModule = await WebAssembly.compileStreaming(fetchedWasm);
  const client = new PGlite({
    // debug: 5,
    wasmModule,
    fsBundle: await fetchedFsBundle.blob(),
    dataDir: "idb://hymnos-pgdata",
    extensions: { pg_trgm },
    relaxedDurability: true,
  });

Should I even expect that pglite give the same performance as running PostgreSQL locally?

bishoyroufael avatar Aug 11 '25 07:08 bishoyroufael

Should I even expect that pglite give the same performance as running PostgreSQL locally?

Definitely not, but we are always interested in making it faster and having some real-life apps benchmarks helps us with this.

dataDir: "idb://hymnos-pgdata",

This is most probably one cause of the slowness that you are seeing. We have ideas on how to improve this, but it is not a priority at the moment.

tdrz avatar Aug 11 '25 08:08 tdrz

Wouldn't relaxedDurability: true eliminate the the slowness of indexedDB in the browser since persistence occurs asynchronously?

bishoyroufael avatar Aug 11 '25 09:08 bishoyroufael

Partially. Could you try it without persistence just to get an idea?

tdrz avatar Aug 11 '25 09:08 tdrz