pgloader
pgloader copied to clipboard
DEFAULT value clause not being preserved for bit(1) / boolean columns during MySQL to Postgres migration
- [X] pgloader --version
pgloader version "3.6.3047c9a"
compiled with SBCL 2.0.1.debian
- [X] did you test a fresh compile from the source tree?
Yes.
- [X] did you search for other similar issues?
Yes.
- [X] how can I reproduce the bug?
PGLoader configuration:
LOAD DATABASE
FROM mysql://mysql:mysql@localhost:3306/testdb
INTO postgresql://postgres:postgres@localhost:5432/testdb
BEFORE LOAD DO
$$ CREATE SCHEMA IF NOT EXISTS testschema; $$;
mysqldump
of testdb
from the MySQL instance:
-- MySQL dump 10.13 Distrib 8.0.23, for osx10.16 (x86_64)
--
-- Host: 127.0.0.1 Database: testdb
-- ------------------------------------------------------
-- Server version 5.5.5-10.1.38-MariaDB-1~bionic
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!50503 SET NAMES utf8mb4 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;
--
-- Table structure for table `testtable`
--
DROP TABLE IF EXISTS `testtable`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;
CREATE TABLE `testtable` (
`boolwithoutdefault` bit(1) DEFAULT NULL,
`boolwithfalsedefault` bit(1) DEFAULT b'0',
`boolwithtruedefault` bit(1) DEFAULT b'1'
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;
--
-- Dumping data for table `testtable`
--
LOCK TABLES `testtable` WRITE;
/*!40000 ALTER TABLE `testtable` DISABLE KEYS */;
/*!40000 ALTER TABLE `testtable` ENABLE KEYS */;
UNLOCK TABLES;
mysqldump: Couldn't execute 'SELECT COLUMN_NAME, JSON_EXTRACT(HISTOGRAM, '$."number-of-buckets-specified"') FROM information_schema.COLUMN_STATISTICS WHERE SCHEMA_NAME = 'testdb' AND TABLE_NAME = 'testtable';': Unknown table 'COLUMN_STATISTICS' in information_schema (1109)
- [X] pgloader output you obtain
pgloader version 3.6.3047c9a
compiled with SBCL 2.0.1.debian
sb-impl::*default-external-format* :ANSI_X3.4-1968
tmpdir: #P"/tmp/pgloader/"
2021-05-03T13:41:45.008000Z NOTICE Starting pgloader, log system is ready.
2021-05-03T13:41:45.018000Z INFO Starting monitor
2021-05-03T13:41:45.021000Z LOG pgloader version "3.6.3047c9a"
2021-05-03T13:41:45.052000Z INFO Parsed command:
LOAD DATABASE
FROM mysql://root:mysqlroot@vanaheim:3306/testdb
INTO postgresql://postgres:postgrespassword@vanaheim:5432/testdb
BEFORE LOAD DO
$$ CREATE SCHEMA IF NOT EXISTS testschema; $$;
2021-05-03T13:41:45.107000Z DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@vanaheim:5432/testdb {1007B4A273}>
2021-05-03T13:41:45.108000Z DEBUG SET client_encoding TO 'utf8'
2021-05-03T13:41:45.111000Z DEBUG SET application_name TO 'pgloader'
2021-05-03T13:41:45.131000Z NOTICE Executing SQL block for before load
2021-05-03T13:41:45.139000Z DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@vanaheim:5432/testdb {1007B4A273}>
2021-05-03T13:41:45.140000Z DEBUG SET client_encoding TO 'utf8'
2021-05-03T13:41:45.144000Z DEBUG SET application_name TO 'pgloader'
2021-05-03T13:41:45.156000Z DEBUG BEGIN
2021-05-03T13:41:45.159000Z SQL CREATE SCHEMA IF NOT EXISTS testschema;
2021-05-03T13:41:45.175000Z LOG Migrating from #<MYSQL-CONNECTION mysql://root@vanaheim:3306/testdb {1007B48AC3}>
2021-05-03T13:41:45.175000Z LOG Migrating into #<PGSQL-CONNECTION pgsql://postgres@vanaheim:5432/testdb {1007B4A273}>
2021-05-03T13:41:45.224000Z DEBUG CONNECTED TO #<MYSQL-CONNECTION mysql://root@vanaheim:3306/testdb {1007B48AC3}>
2021-05-03T13:41:45.232000Z SQL MySQL: sending query: -- params: db-name
-- table-type-name
-- only-tables
-- only-tables
-- including
-- filter-list-to-where-clause incuding
-- excluding
-- filter-list-to-where-clause excluding
select c.table_name, t.table_comment,
c.column_name, c.column_comment,
c.data_type, c.column_type, c.column_default,
c.is_nullable, c.extra
from information_schema.columns c
join information_schema.tables t using(table_schema, table_name)
where c.table_schema = 'testdb' and t.table_type = 'BASE TABLE'
order by table_name, ordinal_position;
2021-05-03T13:41:45.258000Z SQL MySQL: sending query: -- params: db-name
-- including
-- filter-list-to-where-clause incuding
-- excluding
-- filter-list-to-where-clause excluding
SELECT table_name,
coalesce(cast(data_length/avg_row_length as unsigned), 0)
FROM information_schema.tables
WHERE table_schema = 'testdb'
and table_type = 'BASE TABLE'
;
2021-05-03T13:41:45.264000Z SQL MySQL: sending query: -- params: db-name
-- table-type-name
-- only-tables
-- only-tables
-- including
-- filter-list-to-where-clause incuding
-- excluding
-- filter-list-to-where-clause excluding
SELECT s.table_name, s.constraint_name, s.ft, s.cols, s.fcols,
rc.update_rule, rc.delete_rule
FROM
(
SELECT tc.table_schema, tc.table_name,
tc.constraint_name, k.referenced_table_name ft,
group_concat( k.column_name
order by k.ordinal_position) as cols,
group_concat( k.referenced_column_name
order by k.position_in_unique_constraint) as fcols
FROM information_schema.table_constraints tc
LEFT JOIN information_schema.key_column_usage k
ON k.table_schema = tc.table_schema
AND k.table_name = tc.table_name
AND k.constraint_name = tc.constraint_name
WHERE tc.table_schema = 'testdb'
AND k.referenced_table_schema = 'testdb'
AND tc.constraint_type = 'FOREIGN KEY'
GROUP BY tc.table_schema, tc.table_name, tc.constraint_name, ft
) s
JOIN information_schema.referential_constraints rc
ON rc.constraint_schema = s.table_schema
AND rc.constraint_name = s.constraint_name
AND rc.table_name = s.table_name;
2021-05-03T13:41:45.274000Z SQL MySQL: sending query: -- params: db-name
-- table-type-name
-- only-tables
-- only-tables
-- including
-- filter-list-to-where-clause incuding
-- excluding
-- filter-list-to-where-clause excluding
SELECT table_name, index_name, index_type,
sum(non_unique),
cast(GROUP_CONCAT(column_name order by seq_in_index) as char)
FROM information_schema.statistics
WHERE table_schema = 'testdb'
GROUP BY table_name, index_name, index_type;
2021-05-03T13:41:45.297000Z INFO Processing source catalogs
2021-05-03T13:41:45.310000Z NOTICE Prepare PostgreSQL database.
2021-05-03T13:41:45.318000Z DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@vanaheim:5432/testdb {1007B4A273}>
2021-05-03T13:41:45.318000Z DEBUG SET client_encoding TO 'utf8'
2021-05-03T13:41:45.321000Z DEBUG SET application_name TO 'pgloader'
2021-05-03T13:41:45.330000Z DEBUG BEGIN
2021-05-03T13:41:45.379000Z SQL DROP TABLE IF EXISTS testdb.testtable CASCADE;
2021-05-03T13:41:45.391000Z SQL CREATE TABLE testdb.testtable
(
boolwithoutdefault boolean,
boolwithfalsedefault boolean default NULL,
boolwithtruedefault boolean default NULL
);
2021-05-03T13:41:45.406000Z SQL -- params: table-names
select n, n::regclass::oid
from (values ('testdb.testtable')) as t(n);
2021-05-03T13:41:45.435000Z NOTICE COPY testdb.testtable with 0 rows estimated [0/4]
2021-05-03T13:41:45.435000Z DEBUG Reader started for testdb.testtable
2021-05-03T13:41:45.452000Z DEBUG start testdb.testtable 440
2021-05-03T13:41:45.452000Z INFO COPY ON ERROR STOP
2021-05-03T13:41:45.456000Z DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@vanaheim:5432/testdb {100B7D9823}>
2021-05-03T13:41:45.456000Z DEBUG SET client_encoding TO 'utf8'
2021-05-03T13:41:45.461000Z DEBUG SET application_name TO 'pgloader'
2021-05-03T13:41:45.468000Z DEBUG CONNECTED TO #<MYSQL-CONNECTION mysql://root@vanaheim:3306/testdb {100B4C7CF3}>
2021-05-03T13:41:45.468000Z SQL MySQL: sending query: SELECT `boolwithoutdefault`, `boolwithfalsedefault`, `boolwithtruedefault` FROM `testtable`
2021-05-03T13:41:45.470000Z SQL SET search_path TO testdb;
2021-05-03T13:41:45.471000Z DEBUG Reader for testdb.testtable is done in 0.037000s
2021-05-03T13:41:45.472000Z DEBUG Finished processing READER for "testdb.testtable" 0.037000s
2021-05-03T13:41:45.475000Z INFO pgsql:copy-rows-from-queue[2]: testdb.testtable (boolwithoutdefault
boolwithfalsedefault
boolwithtruedefault)
2021-05-03T13:41:45.489000Z DEBUG stop testdb.testtable | 440 .. 494 = 0.054d0
2021-05-03T13:41:45.490000Z DEBUG Writer[2] for testdb.testtable is done in 0.000000s
2021-05-03T13:41:45.490000Z DEBUG Finished processing WRITER for "testdb.testtable" 0.000000s
2021-05-03T13:41:45.490000Z DEBUG writers-counts[testdb.testtable] = 0
2021-05-03T13:41:45.493000Z NOTICE DONE copying testdb.testtable in 0.000s
2021-05-03T13:41:45.494000Z INFO Done with COPYing data, waiting for indexes
2021-05-03T13:41:45.494000Z INFO Done waiting for indexes
2021-05-03T13:41:45.508000Z NOTICE Completing PostgreSQL database.
2021-05-03T13:41:45.508000Z NOTICE Reset sequences
2021-05-03T13:41:45.519000Z DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@vanaheim:5432/testdb {100C0058C3}>
2021-05-03T13:41:45.519000Z DEBUG SET client_encoding TO 'utf8'
2021-05-03T13:41:45.524000Z DEBUG SET application_name TO 'pgloader'
2021-05-03T13:41:45.531000Z DEBUG SET client_encoding TO 'utf8'
2021-05-03T13:41:45.534000Z DEBUG SET application_name TO 'pgloader'
2021-05-03T13:41:45.570000Z DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://postgres@vanaheim:5432/testdb {100C011E43}>
2021-05-03T13:41:45.570000Z DEBUG SET client_encoding TO 'utf8'
2021-05-03T13:41:45.576000Z DEBUG SET application_name TO 'pgloader'
2021-05-03T13:41:45.595000Z DEBUG BEGIN
2021-05-03T13:41:45.614000Z LOG report summary reset
table name errors read imported bytes total time read write
----------------------- --------- --------- --------- --------- -------------- --------- ---------
before load 0 1 1 0.038s
fetch meta data 0 1 1 0.106s
Create Schemas 0 0 0 0.004s
Create SQL Types 0 0 0 0.011s
Create tables 0 2 2 0.041s
Set Table OIDs 0 1 1 0.013s
----------------------- --------- --------- --------- --------- -------------- --------- ---------
testdb.testtable 0 0 0 0.054s 0.037s
----------------------- --------- --------- --------- --------- -------------- --------- ---------
COPY Threads Completion 0 4 4 0.058s
Index Build Completion 0 0 0 0.000s
Reset Sequences 0 0 0 0.046s
Primary Keys 0 0 0 0.000s
Create Foreign Keys 0 0 0 0.000s
Create Triggers 0 0 0 0.006s
Install Comments 0 0 0 0.000s
----------------------- --------- --------- --------- --------- -------------- --------- ---------
Total import time ? 0 0 0.110s
2021-05-03T13:41:45.649000Z INFO Stopping monitor
- [X] data that is being loaded, if relevant
pg_dump
of testdb
database in Postgres created by pgloader
:
--
-- PostgreSQL database dump
--
-- Dumped from database version 11.11 (Debian 11.11-1.pgdg90+1)
-- Dumped by pg_dump version 13.2
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
--
-- Name: testdb; Type: SCHEMA; Schema: -; Owner: postgres
--
CREATE SCHEMA testdb;
ALTER SCHEMA testdb OWNER TO postgres;
--
-- Name: testschema; Type: SCHEMA; Schema: -; Owner: postgres
--
CREATE SCHEMA testschema;
ALTER SCHEMA testschema OWNER TO postgres;
SET default_tablespace = '';
--
-- Name: testtable; Type: TABLE; Schema: testdb; Owner: postgres
--
CREATE TABLE testdb.testtable (
boolwithoutdefault boolean,
boolwithfalsedefault boolean,
boolwithtruedefault boolean
);
ALTER TABLE testdb.testtable OWNER TO postgres;
--
-- Data for Name: testtable; Type: TABLE DATA; Schema: testdb; Owner: postgres
--
COPY testdb.testtable (boolwithoutdefault, boolwithfalsedefault, boolwithtruedefault) FROM stdin;
\.
--
-- PostgreSQL database dump complete
--
Hello!
I'm running into an issue when using pgloader
to migrate a MariaDB 10.1.38
database to a Postgres 11
database. In the source MySQL database, there are several tables that have columns defined as bit(1)
type, so these are becoming boolean
columns when migrated to Postgres. This works fine, but if the source columns also have a DEFAULT
value clause specified, that information appears to get lost in the target Postgres database.
The sample databases shown above are the simplest examples of this behaviour that I could come up with. The mysqldump
output shows that I've created a very silly table:
CREATE TABLE `testtable` (
`boolwithoutdefault` bit(1) DEFAULT NULL,
`boolwithfalsedefault` bit(1) DEFAULT b'0',
`boolwithtruedefault` bit(1) DEFAULT b'1'
)
But the pgloader
output seems to show that it knows that there are DEFAULT
values specified, but they're not being propagated forward to the Postgres database:
2021-05-03T13:41:45.391000Z SQL CREATE TABLE testdb.testtable
(
boolwithoutdefault boolean,
boolwithfalsedefault boolean default NULL,
boolwithtruedefault boolean default NULL
);
I'm not sure if there's an easy workaround for this or not? There might be a way to add CAST
rules that workaround this, but I'm not sure. In the actual database that I'm migrating, there are a large number of tables with bit(1)
columns, and only some of them have DEFAULT
values specified, and some default to true
and some default to false
- ideally I'd like to convert them all correctly and easily at once. ð
I hit the same problem, and prepared a GitHub project to reproduce it, before I found this already existing issue.
Any advice on how to configure pgloader (or a fix to it), so that MySQL bit
colums' default values are migrated to Postgres boolean
columns would be highly appreciated 🙂
I got the same issue as you, my column in MySQL is in bit(1)
, and converting to postgres using pgloader creates a column as boolean not null default NULL
.
Here is what I just did to put a default value in postgres. In your .load file, add this at the end:
...
AFTER CREATE SCHEMA DO
$$ alter table <schema>.<table_name> alter column <column_name> set default false; $$
...
My column in postgres now has a default of false. Let me know if this works for you as it did for me.
Here is how to fix it properly when mysqldump
has default of b'0'
or b'1'
for bit(1)
columns:
- copy the transforms.lisp from the repo to your folder where you have your .load file
- add these lines to the
transforms.lisp
file as follow:
...
(defun bits-to-boolean2 (bit-vector)
"When using MySQL, strange things will happen, like encoding booleans into
bit(1). Of course PostgreSQL wants 'f' and 't'."
(when (and bit-vector (= 4 (length bit-vector)))
(if (string= "b'0'" bit-vector) "f" "t")))
...
- add this line to your .load file:
CAST type bit when (= 1 precision) to boolean drop typemod using bits-to-boolean2
- run command
pgloader --load-lisp-file transforms.lisp <your_file>.load
Note: sorry I don't have much experience with lisp language. If someone finds a better way to include with bits-to-boolean
, please let us know. Then we would not need to add the above line in the .load file.
(cc: @dimitri - Thank you so much for your work on this great tool)
add to @mstephano 's answer
if you have another bit type error you should add bits-to-boolean rule to bits-to-boolean2 rule
this work fine to me
(defun bits-to-boolean2 (bit-vector)
"When using MySQL, strange things will happen, like encoding booleans into
bit(1). Of course PostgreSQL wants 'f' and 't'."
(cond ((and bit-vector (= 1 (length bit-vector)))
(let ((bit (aref bit-vector 0)))
(etypecase bit
(fixnum (if (= 0 bit) "f" "t"))
(character (if (= 0 (char-code bit)) "f" "t")))))
((and bit-vector (= 4 (length bit-vector)))
(if (string= "b'0'" bit-vector) "f" "t"))
(t nil)))
This is what worked for me by using the --load-lisp-file
option:
(in-package :pgloader.transforms)
(defun bits-to-boolean (bit-vector)
"Converts MySQL bit(1) to PostgreSQL boolean 'f' or 't'."
(cond
;; numeric 0 or 1
((and (numberp bit-vector) (member bit-vector '(0 1)))
(if (zerop bit-vector) "f" "t"))
;; single-character '0' or '1'
((and (stringp bit-vector) (= (length bit-vector) 1))
(case (aref bit-vector 0)
(#\0 "f")
(#\1 "t")))
;; binary string representation "b'0'" or "b'1'"
((string= "b'0'" bit-vector) "f")
((string= "b'1'" bit-vector) "t")
;; no match
(t nil)))
I do get a warning like WARNING: redefining PGLOADER.TRANSFORMS::BITS-TO-BOOLEAN in DEFUN
but otherwise it works well.