py-mysql2pgsql icon indicating copy to clipboard operation
py-mysql2pgsql copied to clipboard

psycopg2.InterfaceError: can't encode unicode string to latin-1

Open mcepl opened this issue 10 years ago • 12 comments

With MySQL-python-1.2.3-0.3.c1.1.el6.i686, python-psycopg2-2.0.14-2.el6.i686, mysql-5.1.71-1.el6.i686, postgresql-8.4.18-1.el6_4.i686 (output to file)

Traceback (most recent call last):
  File "/var/lib/pgsql/py-mysql2pgsql/bin/py-mysql2pgsql", line 38, in <module>
    mysql2pgsql.Mysql2Pgsql(options).convert()
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/mysql2pgsql.py", line 31, in convert
    Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/converter.py", line 31, in convert
    self.writer.write_table(table)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/__init__.py", line 90, in decorated_function
    return f(*args, **kwargs)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_file_writer.py", line 65, in write_table
    table_sql, serial_key_sql = super(PostgresFileWriter, self).write_table(table)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_writer.py", line 177, in write_table
    primary_keys, serial_key, maxval, columns = self.table_attributes(table)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_writer.py", line 153, in table_attributes
    columns.write('  %s,\n' % self.column_description(column))
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_writer.py", line 18, in column_description
    return '"%s" %s' % (column['name'], self.column_type_info(column))
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_writer.py", line 107, in column_type_info
    default, column_type = get_type(column)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/mysql2pgsql/lib/postgres_writer.py", line 40, in get_type
    default = (' DEFAULT %s' % QuotedString(column['default']).getquoted()) if t(column['default']) else None
psycopg2.InterfaceError: can't encode unicode string to latin-1
(py-mysql2pgsql)postgres@luther: py-mysql2pgsql$ 

mcepl avatar Dec 20 '13 12:12 mcepl

Configuration:

mysql:
 hostname: localhost
 port: 3306
 socket: /srv/mysql/mysql.sock
 username: root
 password: something
 database: bugs
 compress: false
destination:
 file: bugzilla-dump-pg.sql
 postgres:
  hostname: localhost
  port: 5432
  username: postgres
  password: 
  database: bugs
supress_data: false
supress_ddl: false
force_truncate: false

mcepl avatar Dec 20 '13 12:12 mcepl

Doesn't looks like a current code. Can you try this with what we have in the git or at least with the latest version?

kworr avatar Dec 20 '13 12:12 kworr

That's what I've got with pip install py-mysql2pgsql

mcepl avatar Dec 20 '13 13:12 mcepl

OK, when using git master I get this:

(py-mysql2pgsql)postgres@luther: py-mysql2pgsql$ py-mysql2pgsql --verbose
>>>>>>>>>> STARTING <<<<<<<<<<


START CREATING TABLES
  START  - CREATING TABLE attach_data
Traceback (most recent call last):
  File "/var/lib/pgsql/py-mysql2pgsql/bin/py-mysql2pgsql", line 5, in <module>
    pkg_resources.run_script('py-mysql2pgsql==0.1.6', 'py-mysql2pgsql')
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 461, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 1194, in run_script
    execfile(script_filename, namespace, namespace)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/EGG-INFO/scripts/py-mysql2pgsql", line 38, in <module>
    mysql2pgsql.Mysql2Pgsql(options).convert()
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/mysql2pgsql/mysql2pgsql.py", line 31, in convert
    Converter(reader, writer, self.file_options, self.run_options.verbose).convert()
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/mysql2pgsql/lib/converter.py", line 31, in convert
    self.writer.write_table(table)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/mysql2pgsql/lib/__init__.py", line 86, in decorated_function
    ret = f(*args, **kwargs)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/mysql2pgsql/lib/postgres_db_writer.py", line 152, in write_table
    table_sql, serial_key_sql = super(PostgresDbWriter, self).write_table(table)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/mysql2pgsql/lib/postgres_writer.py", line 213, in write_table
    primary_keys, serial_key, maxval, columns = self.table_attributes(table)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/mysql2pgsql/lib/postgres_writer.py", line 189, in table_attributes
    columns.write('  %s,\n' % self.column_description(column))
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/mysql2pgsql/lib/postgres_writer.py", line 26, in column_description
    return '"%s" %s' % (column['name'], self.column_type_info(column))
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/mysql2pgsql/lib/postgres_writer.py", line 126, in column_type_info
    default, column_type = get_type(column)
  File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/mysql2pgsql/lib/postgres_writer.py", line 124, in get_type
    raise Exception('unknown %s' % column['type'])
Exception: unknown longblob
(py-mysql2pgsql)postgres@luther: py-mysql2pgsql$ 

(with or without tables created)

mcepl avatar Dec 20 '13 21:12 mcepl

This may be about issue #51.

kworr avatar Dec 23 '13 09:12 kworr

I don’t think I have any large blobs there ... it is a bugzilla instance.

mcepl avatar Dec 23 '13 15:12 mcepl

It's about types, not about big chunks of data:

File "/var/lib/pgsql/py-mysql2pgsql/lib/python2.6/site-packages/py_mysql2pgsql-0.1.6-py2.6.egg/mysql2pgsql/lib/postgres_writer.py", line 124, in get_type
  raise Exception('unknown %s' % column['type'])

There's also no correct code in referenced issue. As I need something to test my assumptions on you please my laziness by creating a good MySQL dump sample containing table with some longblob data. Or I can try to do this myself yet right now I'm a bit out of time...

kworr avatar Dec 23 '13 16:12 kworr

This http://mcepl.fedorapeople.org/tmp/bugzilla-dump.sql.bz2 is the dump of the database I have tried to convert from MySQL to Postgres (the only change is that passwords are replaced with a nonsensical text). Is this what you need?

mcepl avatar Dec 23 '13 22:12 mcepl

It would be enough to have only table with attachments as it's the one that contains longblob data. But thanks anyway. :) I'll take a look.

PS: Dump can be already removed, I made myself a copy so there's no need to expose it anymore.

kworr avatar Dec 24 '13 14:12 kworr

Any updates on this? I have encountered the same issue while trying to migrate a RequestTracker database

johndoe123456 avatar Jan 21 '14 20:01 johndoe123456

I'm sorry, I'm just switching jobs and I hadn't enough time to invest in this one. The time mostly needed to install exactly the same software/os stack because I can't reproduce this one on my test machine.

I have: py27-MySQLdb-1.2.3_3, python27-2.7.6_4, py27-psycopg2-2.5.2, mariadb55-server-5.5.35, postgresql93-server-9.3.3.

And this table with data:

CREATE TABLE `attach_data` (
  `id` mediumint(9) NOT NULL,
  `thedata` longblob NOT NULL,
  PRIMARY KEY (`id`),
  CONSTRAINT `fk_attach_data_id_attachments_attach_id` FOREIGN KEY (`id`) REFERENCES `attachments` (`attach_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1

For me is correctly transmuted to:

                      Table "public.attach_data"
 Column  |  Type   | Modifiers | Storage  | Stats target | Description
---------+---------+-----------+----------+--------------+-------------
 id      | integer | not null  | plain    |              |
 thedata | bytea   | not null  | extended |              |
Indexes:
    "attach_data_id_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "attach_data_id_fkey" FOREIGN KEY (id) REFERENCES attachments(attach_id)
Has OIDs: no

kworr avatar Mar 09 '14 18:03 kworr

I have already fix in #61 pull request with utf-8 encoding

clevertension avatar Jun 18 '14 04:06 clevertension