connector-x
connector-x copied to clipboard
MySQL TINYINT gets read as boolean into Arrow Dataframe
What language are you using?
Python
What version are you using?
The bug exists in the 0.3.1 and the current 0.3.2 dev version
What database are you using?
MySQL
What dataframe are you using?
Arrow
Can you describe your bug?
Columns using a TINYINT(4) type are cast as a boolean when read into an Arrow dataframe in version 0.3.1 and onwards. I don't know if this is happening across any other types. I have seen columns with large integers sometimes get cast as floats, though I don't have an example of this available.
CREATE TABLE `cx_test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`test_column` tinyint(4) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=latin1;
INSERT INTO `cx_test` (test_column)
VALUES (0);
INSERT INTO `cx_test` (test_column)
VALUES (1);
INSERT INTO `cx_test` (test_column)
VALUES (2);
INSERT INTO `cx_test` (test_column)
VALUES (3);
With Version 0.3.0, everything works correctly:
cx.read_sql(conn_str, "select * from cx_test", return_type="arrow")
pyarrow.Table
id: int64
test_column: int64
----
id: [[1,2,3,4]]
test_column: [[0,1,2,3]]
With Version 0.3.1 and onwards, the data is converted into bools:
cx.read_sql(conn_str, "select * from cx_test", return_type="arrow")
pyarrow.Table
id: int64
test_column: bool
----
id: [[1,2,3,4]]
test_column: [[false,true,true,true]]