Allow dumping data only
There is an edgedb dump command that does a full dump of the database(s).
The edgedb restore command can take the generated dump file(s) and apply them to an empty database. However, there is no way to dump (or restore) data only.
What did I try?
- A *.dump file provides the EdgeQL code for the schema but keeps the data in binary format, so there is no way to apply that data on its own.
- https://www.edgedb.com/docs/changelog/3_x#sql-support describes the SQL support. However, using pg_dump gives a relational representation of the data that is (let's say) impossible to apply to an EdgeDB instance.
As a solution, I propose adding a --data-only option to the edgedb dump command that outputs the data as EdgeQL insert queries.
There is also a related question: why does edgedb dump keep data in binary form? As far as I can see, a --format binary|edgeql option would make sense, providing consistent output regardless of what is being dumped: all, data only, etc.
I've tried generating dumps like this myself, but EdgeDB doesn't seem to be able to handle .edgeql files with thousands of inserts. Separate queries like
insert SomeObject {
    title := '...',
    body := '...',
};
insert SomeObject {
    title := '...',
    body := '...',
};
take about a second each, which is way too slow, and a couple hundred MB of JSON in
with objects := to_json("[...]"),
for object in json_array_unpack(objects) union (
    insert SomeObject {
        title := <str>object['title'],
        body := <str>object['body'],
    }
)
makes the Python process sit at around 10% CPU for a few minutes until the server runs out of memory and crashes.
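A middle ground that might be worth trying is sending the same JSON in smaller batches, passed as a query parameter rather than inlined into the query text. The following is only a rough sketch, not a tested fix: the helper names chunked and bulk_insert and the default batch size of 1000 are made up for illustration, and client is assumed to be any object with a query(query, **params) method, such as the client returned by edgedb.create_client() in the edgedb Python package. Whether this avoids the memory problem on a real server is untested here.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

# Same shape as the query above, but the JSON arrives as the $data
# parameter instead of being embedded in the query string.
INSERT_QUERY = """
    with objects := <json>$data
    for object in json_array_unpack(objects) union (
        insert SomeObject {
            title := <str>object['title'],
            body := <str>object['body'],
        }
    )
"""


def chunked(rows: Iterable[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive lists of at most `size` rows (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[Dict[str, Any]] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def bulk_insert(client: Any, rows: Iterable[Dict[str, Any]], batch_size: int = 1000) -> int:
    """Run one insert query per batch and return the number of rows sent.

    `client` is assumed to expose query(query, **params); the batch size
    is an arbitrary starting point, not a measured optimum.
    """
    sent = 0
    for batch in chunked(rows, batch_size):
        # Each call serializes only one batch, so neither side has to
        # hold the whole dataset as a single JSON value.
        client.query(INSERT_QUERY, data=json.dumps(batch))
        sent += len(batch)
    return sent
```

With the edgedb package installed, usage would be along the lines of bulk_insert(edgedb.create_client(), rows), where rows is an iterable of dicts with title and body keys. Each batch runs as its own query, so a failure partway through leaves earlier batches inserted unless the whole loop is wrapped in a transaction.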
I know this hasn't seen any movement, but putting in a +1 for --format=edgeql!