Enable `COPY` for seed and migrations
Describe the bug
INSERT INTO is slow for data dump in comparison to COPY when you need to scale, COPY is also the default export with pg_dump for postgres.
Migrations and seed.sql that contains COPY statement fails.
To Reproduce Steps to reproduce the behavior:
- Create a migration with the
COPYstatement to feed your DB
COPY auth.users (instance_id, id, aud, role, email, encrypted_password, email_confirmed_at, invited_at, confirmation_token, confirmation_sent_at, recovery_token, recovery_sent_at, email_change_token_new, email_change, email_change_sent_at, last_sign_in_at, raw_app_meta_data, raw_user_meta_data, is_super_admin, created_at, updated_at, phone, phone_confirmed_at, phone_change, phone_change_token, phone_change_sent_at, email_change_token_current, email_change_confirm_status, banned_until, reauthentication_token, reauthentication_sent_at, is_sso_user, deleted_at, is_anonymous) FROM stdin;
00000000-0000-0000-0000-000000000000 e3dc0b91-c49a-47f8-8421-79001580962b authenticated authenticated [email protected] $2a$10$HVoLo3KAEmuKkr.3hnARPOubNZOrDCxROmo31AX7rRNCOKsmPBcdq 2024-06-08 17:41:47.920181+00 \N 2024-06-08 17:41:23.570749+00 \N \N 2024-06-08 17:42:23.982138+00 {"provider": "email", "providers": ["email"]} {"sub": "e3dc0b91-c49a-47f8-8421-79001580962b", "email": "[email protected]", "email_verified": false, "phone_verified": false} \N 2024-06-08 17:41:23.553288+00 2024-06-08 17:42:23.987962+00 \N \N \N 0 \N \N f \N f
\.
supabase db reset
Expected behavior Successfull migration
Screenshots
Applying migration 20240827020350_my_migration.sql...
FATAL: terminating connection because protocol synchronization was lost (SQLSTATE 08P01)
At statement 10: --
-- TOC entry 3955 (class 0 OID 29188)
-- Dependencies: 287
-- Data for Name: users; Type: TABLE DATA; Schema: auth; Owner: supabase_auth_admin
--
COPY auth.users (instance_id, id, aud, role, email, encrypted_password, email_confirmed_at, invited_at, confirmation_token, confirmation_sent_at, recovery_token, recovery_sent_at, email_change_token_new, email_change, email_change_sent_at, last_sign_in_at, raw_app_meta_data, raw_user_meta_data, is_super_admin, created_at, updated_at, phone, phone_confirmed_at, phone_change, phone_change_token, phone_change_sent_at, email_change_token_current, email_change_confirm_status, banned_until, reauthentication_token, reauthentication_sent_at, is_sso_user, deleted_at, is_anonymous) FROM stdin
System information
Rerun the failing command with --create-ticket flag.
- Ticket ID:
f06acc6824804d0a8044a1a8bd6d1108 - Version of OS:
MacOS 14.1.1 - Version of CLI:
1.190.0 - Version of Docker:
Docker version 27.1.1, build 6312585 - Versions of services:
N/A
Can you share the time it took for you and the size of database you are seeding? The default multiline insert statements emitted by supabase db dump is very fast for seeding small datasets (10k+ rows).
Unfortunately the syntax of COPY ... FROM STDIN; ... \. is specific to psql client. It will take some engineering effort to build it with pgx while handling all the edge cases around sql parsing. We could also explore using psql directly to support directory of fixtures https://github.com/supabase/cli/issues/2026.
That said, the current recommendation for seeding larger datasets is to use supabase db dump --use-copy to emit sql with copy statements, followed by running psql -f seed.sql separately.
Understood, thank you for this quick answer 🥇
I'm seeding 400k+ rows the problem is when you export the data, the database using COPY statement is fast, while pg_dump --inserts is slower. It's also a bit less readable, IMO.
We can mix them, but it's less convenient than trigger a supabase db reset to be sure that everything works as intended rather than making an external custom script that combine it with psql.
PS: Is there a way to split the seed.sql across multiple files, actually I'm creating fixtures into migrations folder to achieve it.
PS: Is there a way to split the
seed.sqlacross multiple files, actually I'm creatingfixturesintomigrationsfolder to achieve it.
Hey ! We just merged https://github.com/supabase/cli/pull/2702 that should help with that.
great thank you @avallete