edgedb-cli

Allow to dump data only

Open extsoft opened this issue 1 year ago • 2 comments

There is an edgedb dump command that does a full dump of the database(s). The edgedb restore command can take the generated dump file(s) and apply them to an empty database. However, there is no way to dump (or restore) data only.

What did I try?

  1. A *.dump file provides the EdgeQL code for the schema but keeps the data in a binary format, so there is no way to apply that data on its own.
  2. https://www.edgedb.com/docs/changelog/3_x#sql-support announces SQL support. However, using pg_dump gives a relational representation of the data that is (let’s say) impossible to apply to an EdgeDB instance.

As a solution, I propose adding a --data-only option to the edgedb dump command that outputs the data as EdgeQL insert queries.

There is also a related question: why does edgedb dump keep data in binary form? As far as I can see, a --format binary|edgeql option would make sense, providing consistent output regardless of what is being dumped: everything, data only, etc.

extsoft avatar Aug 08 '23 17:08 extsoft

I've tried generating dumps like this myself, but EdgeDB doesn't seem to be able to handle .edgeql files with thousands of inserts. Separate queries like

insert SomeObject {
  title := '...',
  body := '...',
};
insert SomeObject {
  title := '...',
  body := '...',
};

take about a second each, which is way too slow, and a couple hundred MB of JSON in

with objects := to_json("[...]"),
for object in json_array_unpack(objects) union (
  insert SomeObject {
    title := <str>object['title'],
    body := <str>object['body'],
  }
)

makes the Python process sit at around 10% CPU for a few minutes until the server runs out of memory and crashes.
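
One possible middle ground between one query per object and a single enormous JSON literal is to send the rows in fixed-size JSON batches, reusing the for-loop insert above with the array passed as a query parameter. The sketch below only builds the payloads. The helper name json_batches, the $data parameter name, and the batch size of 1000 are assumptions for illustration and were not tested against the workload described here, so this may or may not avoid the slowdown and the out-of-memory crash.

```python
import json
from typing import Dict, Iterable, Iterator, List

# Same shape as the for-loop insert above, but the JSON array arrives as a
# query parameter ($data, a hypothetical name) instead of an inline literal.
BULK_INSERT = """
with objects := <json>$data
for object in json_array_unpack(objects) union (
  insert SomeObject {
    title := <str>object['title'],
    body := <str>object['body'],
  }
)
"""


def json_batches(rows: Iterable[Dict[str, str]], batch_size: int = 1000) -> Iterator[str]:
    """Yield JSON array strings holding at most batch_size rows each."""
    batch: List[Dict[str, str]] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield json.dumps(batch)
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield json.dumps(batch)
```

Assuming the official edgedb Python client, each payload could then be sent with something like client.query(BULK_INSERT, data=payload), so that no single query has to carry the whole dataset.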

KaelWD avatar Nov 18 '23 08:11 KaelWD

I know this hasn't seen any movement, but putting in a +1 for --format=edgeql!

tomnz avatar Mar 01 '24 21:03 tomnz