dolt icon indicating copy to clipboard operation
dolt copied to clipboard

`dolt dump` should batch inserts

Open timsehn opened this issue 2 years ago • 1 comments

Right now dolt dump makes an individual insert for each row in the database. This is especially bad for reimporting into Dolt.

Here's an example:

$ cat doltdump.sql 
SET FOREIGN_KEY_CHECKS=0;
SET UNIQUE_CHECKS=0;
DROP TABLE IF EXISTS `deleted_employees`;
CREATE TABLE `deleted_employees` (
  `id` int NOT NULL,
  `last_name` varchar(100),
  `first_name` varchar(100),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;
INSERT INTO `deleted_employees` (`id`,`last_name`,`first_name`) VALUES (0,'Sehn','Tim');
DROP TABLE IF EXISTS `employees`;
CREATE TABLE `employees` (
  `id` int NOT NULL,
  `last_name` varchar(100),
  `first_name` varchar(100),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;
INSERT INTO `employees` (`id`,`last_name`,`first_name`) VALUES (1,'Hendriks','Brian');
INSERT INTO `employees` (`id`,`last_name`,`first_name`) VALUES (2,'Son','Aaron');
CREATE TRIGGER `soft_delete` after delete on employees for each row insert into deleted_employees values (old.id, old.last_name, old.first_name);

We should use the approach used here to batch up inserts on dump:

https://github.com/dolthub/insert-batcher

timsehn avatar Dec 08 '22 21:12 timsehn

To be clear, the two inserts in the above dump should be one that looks like:

INSERT INTO `employees` (`id`,`last_name`,`first_name`) VALUES (1,'Hendriks','Brian'), (2,'Son','Aaron');

timsehn avatar Dec 08 '22 23:12 timsehn

dolt dump already has a --batch option that I think accomplishes what you want https://docs.dolthub.com/cli-reference/cli#dolt-dump

jycor avatar Dec 14 '22 18:12 jycor

That should just be the default.

timsehn avatar Dec 14 '22 18:12 timsehn