DBIish
MySQL driver cannot handle bigint unsigned in `execute( )`
CREATE TABLE foo ( id bigint unsigned ) engine=InnoDB;
INSERT INTO foo ( id ) VALUES ( 18446744073709551615 ); -- 2^64 - 1
This value can be SELECTed without any issues:
my $id = $connection.execute( 'SELECT id FROM foo' ).allrows( )[ 0 ][ 0 ];
say $id;
will print 18446744073709551615
But somehow it cannot be used as an `execute()` argument:
$connection.execute( 'SELECT id FROM foo WHERE id = ?', $id );
Cannot unbox 64 bit wide bigint into native integer
in block at /home/bbkr/.raku/sources/A15CA24FF2518653680BE63142CB6661532CAA84 (DBDish::mysql::StatementHandle) line 119
in block at /home/bbkr/.raku/sources/A15CA24FF2518653680BE63142CB6661532CAA84 (DBDish::mysql::StatementHandle) line 114
in method execute at /home/bbkr/.raku/sources/A15CA24FF2518653680BE63142CB6661532CAA84 (DBDish::mysql::StatementHandle) line 111
in method execute at /home/bbkr/.raku/sources/B092781CCB9B75F51B0389A686A61C391F70F118 (DBDish::mysql::Connection) line 57
https://github.com/raku-community-modules/DBIish/blob/main/lib/DBDish/mysql/StatementHandle.rakumod#L119
This logic is not valid: int64 has one bit less and cannot store the maximum unsigned value.
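For illustration, the failure can be reproduced in plain Raku without DBIish (the error text matches the trace above):

say Blob[uint64].new(2**64 - 1);  # constructs fine
say Blob[int64].new(2**64 - 1);   # dies: Cannot unbox 64 bit wide bigint into native integer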
I'm not sure what the best fix is without a performance penalty:
when Int {
    $st = MYSQL_TYPE_LONGLONG;
    $_ > 0 ?? Blob[uint64].new($_) !! Blob[int64].new($_);
}
?
I think this needs a proper solution. The current one applies the wrong type to HALF of the available range based on a very risky assumption.
Yes, most autoincremented IDs are small. But bigints are also used, for example, as a compact way to store binary flags. Or geo data. The driver should not manipulate the data type like that; it is error prone and may create weird side effects.
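For illustration, here is a hypothetical flags column sketched in plain Raku; as soon as bit 63 is set, the value leaves the int64 range even though it is a perfectly valid bigint unsigned value:

# Hypothetical example: packing boolean flags into one BIGINT UNSIGNED column.
my $flags = 0;
$flags +|= 1 +< 3;       # some low flag
$flags +|= 1 +< 63;      # top flag, pushes the value past the int64 maximum
say $flags;              # 9223372036854775816
say $flags > 2**63 - 1;  # True, so Blob[int64].new($flags) would fail to unbox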
It's impossible to fix properly, as MySQL doesn't have any way of handling numbers over 2**64 other than as a string (that I've found), so a string needs to be the final fallback.
At best, we can improve 2^63 to 2^64.
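As a plain-Raku sketch of that string fallback (the value here is just an illustration):

my $huge = 2**80;      # too wide even for uint64
say $huge.Str.encode;  # utf8 bytes of "1208925819614629174706176"
# Blob[uint64].new($huge) dies too: the bigint is too wide to unbox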
So you prefer this approach?
if $_ > -2**63 and $_ < 2**63 {
    $st = MYSQL_TYPE_LONGLONG;
    Blob[int64].new($_);
}
elsif $_ > 0 and $_ < 2**64 - 1 {
    $st = MYSQL_TYPE_LONGLONG;
    Blob[uint64].new($_);
}
else {
    .Str.encode;
}
Also, can you provide a test case that shows a change? If this has a semantic impact, we should have a test to ensure the whole BIGINT range keeps consistent semantics.
The round-trip test I added to t/24-mysql-large-value.t doesn't detect any difference between a string-encoded number and a LONGLONG-encoded number.
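For reference, a minimal round-trip check in that spirit could look like this; the connection details and table name are placeholders, not the actual test file:

use Test;
use DBIish;

# Placeholder connection details, adjust for a real test environment.
my $dbh = DBIish.connect('mysql', :host<localhost>, :database<test>, :user<tester>, :password<secret>);
$dbh.execute('CREATE TEMPORARY TABLE big_roundtrip (id BIGINT UNSIGNED)');

# Boundary values across the whole unsigned range.
for 0, 1, 2**63 - 1, 2**63, 2**64 - 1 -> $v {
    $dbh.execute('INSERT INTO big_roundtrip (id) VALUES (?)', $v);
    my $got = $dbh.execute('SELECT id FROM big_roundtrip WHERE id = ?', $v).row[0];
    is $got, $v, "round trip for $v";
    $dbh.execute('DELETE FROM big_roundtrip');
}
done-testing;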
The issue refers to the biggest numeric type that MySQL can handle. IMO this `else` part is not test-coverable (or is it?), and a simple "if > 0 then use uint64, else use int64" should be enough (and faster).
The functional difference is on the data consumer side. One expects a numeric type from a numeric column. This value may, for example, get propagated incorrectly as a string to JSON and crash some strongly typed consumer of such a payload. A MySQL->MySQL round trip does not show the whole story.
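A sketch of that consumer-side difference, assuming JSON::Fast as the serializer:

use JSON::Fast;

# An Int serializes as a JSON number, a Str as a JSON string;
# a strongly typed consumer may reject the latter for a numeric field.
say to-json({ id => 18446744073709551615 });    # {"id":18446744073709551615}
say to-json({ id => '18446744073709551615' });  # {"id":"18446744073709551615"}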
In my particular case, https://github.com/bbkr/UpRooted/blob/master/t/31-reader-mysql.rakutest#L42 - this hit me when I was checking how dumped data is presented for consumption and I was unable to truly test the whole field range.
> The issue refers to the biggest numeric type that MySQL can handle. IMO this `else` part is not test-coverable (or is it?), and a simple "if > 0 then use uint64, else use int64" should be enough (and faster).
Unfortunately this is not adequate. Raku Ints such as 2^80 do not fit into either uint64 or MYSQL_TYPE_LONGLONG and need to be transmitted as a string. MySQL numeric types can store much larger numbers than BIGINT, and even if they couldn't, generating a suitable error message (rather than a complaint about unboxing) requires more logic.
> The functional difference is on the data consumer side. One expects a numeric type from a numeric column.
Agreed, and in the test I added, all values are returned as a Rat, as expected from a Numeric column.
If you can identify a semantic difference and we code around it, we need to add a DBIish test for it to ensure it doesn't change in the future.
> In my particular case, https://github.com/bbkr/UpRooted/blob/master/t/31-reader-mysql.rakutest#L42 - this hit me when I was checking how dumped data is presented for consumption and I was unable to truly test the whole field range.
This appears to be a store-and-read round trip? You'll get back an Int as the Raku datatype from a MySQL bigint column regardless of whether it's transmitted across the protocol as fixed-length 8 bytes or as a Unicode-encoded string.
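Reusing $connection and $id from the opening example, that claim can be checked like this (a sketch, not driver code):

my $val = $connection.execute('SELECT id FROM foo WHERE id = ?', $id).row[0];
say $val.^name;   # Int, regardless of the wire encoding
say $val == $id;  # True once the round trip works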
OK, then the fastest way seems to be (also with corrected range edges):
if -2**63 <= $_ <= 2**63 - 1 {    # most stuff goes here
    $st = MYSQL_TYPE_LONGLONG;
    Blob[int64].new($_);
}
elsif 2**63 <= $_ <= 2**64 - 1 {  # half of the unsigned int range goes here and avoids the costly String conversion
    $st = MYSQL_TYPE_LONGLONG;
    Blob[uint64].new($_);
}
else {                            # super rare cases in real-world use
    .Str.encode;
}
That is a mix of your solution and mine: it handles the most common cases cheaply, avoids the string-casting penalty for the top of the unsigned int range, and has a fallback for overflows. Am I correct?
BTW: My project does a round trip when writing data here: https://github.com/bbkr/UpRooted/blob/master/t/42-writer-mysql.rakutest#L28
So it is currently tested "up to signed int". I think your solution alone will work just fine, though with the slight performance penalty mentioned in my previous post.
The code you suggest seems reasonable to me but it doesn't quite work.
In this branch: https://github.com/raku-community-modules/DBIish/commit/ed66c7e48651f2b18bab04bdbfd6a37baa2edb4f
We get this test result: https://github.com/raku-community-modules/DBIish/actions/runs/3541415105/jobs/5945651204#step:12:201
These values:
9223372036854775808
18446744073709551615
Come back as these values:
-1
-9223372036854775808
Weird, on a logical level this fix should be correct (Blobs do not overflow). I haven't looked into the test itself yet (I'm still at my $dayjob).
Are you sure MYSQL_TYPE_LONGLONG handles those values? I'm reading it as an int64_t.
You're correct - it does not: https://dev.mysql.com/doc/c-api/8.0/en/c-api-prepared-statement-type-codes.html
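That also explains the exact values above: reinterpreting the uint64 bit patterns as two's-complement int64 reproduces them. A plain-Raku sketch with a hypothetical helper:

# Hypothetical helper: reinterpret an unsigned 64-bit value as signed two's complement.
sub as-int64(UInt $u) { $u >= 2**63 ?? $u - 2**64 !! $u }

say as-int64(9223372036854775808);   # -9223372036854775808
say as-int64(18446744073709551615);  # -1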
So my fix proposal was based on a wrong assumption, sorry about that. Looks like your version is the valid way to solve it, and there will always be a string-casting performance penalty in the upper bigint unsigned range.
No worries. I'll merge in the lower boundary issue you identified as soon as the tests pass.