jaybird
With UTF8, exceeding the size limit does not throw the same exception [JDBC354]
Submitted by: Chouteau Mathieu (chouteaum)
Attachments: TestEncodingFB.java
Use a PreparedStatement with a parameter on a 5-character column.
When you execute the query (select, update, delete or insert), the result differs depending on whether the length of the parameter value is over 5 or over 20:
- Over 5 characters, you get an FBSQLException
- Over 20 characters, you get a DataTruncation
If the value contains 11 accented characters, the DataTruncation exception is thrown.
I have attached a JUnit test case.
Commented by: Chouteau Mathieu (chouteaum)
JUnit test case
Modified by: Chouteau Mathieu (chouteaum)
Attachment: TestEncodingFB.java [ 12521 ]
Commented by: Chouteau Mathieu (chouteaum)
To create the table which is used by the JUnit test case:
CREATE TABLE TEST ( ID integer NOT NULL, CODE varchar(5), CONSTRAINT CONSTRAINT_NAME PRIMARY KEY (ID) );
Commented by: @mrotteveel
The observed behavior is caused by a length check that doesn't take the bytes per character into account. It only throws the DataTruncation exception when the value exceeds the storage length in bytes (number of characters * number of bytes per character) instead of the declared number of characters. I am not sure if I am going to fix this in 2.2, as this will probably be significantly rewritten in Jaybird 3.0.
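To make the thresholds in the report concrete, here is a minimal sketch of the two possible checks for the varchar(5) column. It is not the actual Jaybird code: the class and method names are hypothetical, and it assumes the UTF8 character set reserves 4 bytes per character, which gives the storage length of 20 bytes mentioned in the report.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not Jaybird's implementation.
public class StorageLengthCheckDemo {
    static final int DECLARED_CHARS = 5;      // varchar(5) from the test table
    static final int MAX_BYTES_PER_CHAR = 4;  // assumption: Firebird UTF8 reserves 4 bytes per character

    /** Byte-based check: only trips once the encoded value exceeds 5 * 4 = 20 bytes. */
    static boolean exceedsStorageLength(String value) {
        int storageLength = DECLARED_CHARS * MAX_BYTES_PER_CHAR;
        return value.getBytes(StandardCharsets.UTF_8).length > storageLength;
    }

    /** Character-based check: trips as soon as the value has more than 5 codepoints. */
    static boolean exceedsDeclaredLength(String value) {
        return value.codePointCount(0, value.length()) > DECLARED_CHARS;
    }
}
```

With these definitions, a 6-character ASCII value passes the byte-based check but fails the character-based one, so the error would only surface on the server. A value of 11 accented characters such as é encodes to 22 bytes in UTF-8 and fails the byte-based check, which matches the DataTruncation described in the report.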
The problem with solving this is that, for example for UTF8, simply counting (Java) characters won't do. For example, the string "abcd\uD83D\uDE03" is 6 Java chars long, the last two being a surrogate pair representing a single codepoint, but conversion to UTF8 will yield the byte representation of 5 characters/codepoints (the last being 😃).
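The difference between the three counts can be seen with plain JDK calls. This is only an illustration of the example string above; the class and helper names are made up for the sketch.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: compares the different "lengths" of the same string.
public class CodePointLengthDemo {
    /** UTF-16 code units, which is what String.length() reports. */
    static int javaChars(String s) {
        return s.length();
    }

    /** Unicode codepoints; a surrogate pair counts as one. */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Number of bytes after encoding to UTF-8. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String value = "abcd\uD83D\uDE03";
        System.out.println(javaChars(value));  // 6
        System.out.println(codePoints(value)); // 5
        System.out.println(utf8Bytes(value));  // 8
    }
}
```

So this value fits a 5-character column by codepoint count, even though String.length() reports 6.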
In addition, older versions of Firebird did - intentionally - not perform character length checks for UNICODE_FSS, allowing you to store characters up to the storage size in bytes, not declared character size (though Jaybird will truncate this when selecting values).
I could maybe check if the encoding is UTF8 (and not UNICODE_FSS), and then, if the string length() is too long, check the codepoint count and if that exceeds the length, throw a DataTruncation error as well. However, this would then result in different behaviour compared to setBytes.
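A rough sketch of what such a check might look like, under stated assumptions: the class, method, and parameter names are hypothetical and not part of Jaybird, the encoding is passed in as a plain string for simplicity, and only java.sql.DataTruncation is a real JDK class.

```java
import java.sql.DataTruncation;

// Hypothetical sketch of the proposed check; not Jaybird's actual code.
public class CodePointTruncationCheck {
    /**
     * Throws DataTruncation if the value has more codepoints than the declared
     * column length. Only applies to UTF8, not to UNICODE_FSS or other encodings.
     */
    static void checkLength(int parameterIndex, String value, int declaredChars,
            String encoding) throws DataTruncation {
        if (!"UTF8".equals(encoding)) {
            return; // leave other encodings (including UNICODE_FSS) to the existing check
        }
        if (value.length() <= declaredChars) {
            return; // codepoint count can never exceed length(), so the value fits
        }
        // length() is too long, but surrogate pairs may still make it fit
        int codePoints = value.codePointCount(0, value.length());
        if (codePoints > declaredChars) {
            // parameter = true, read = false: truncation while writing a parameter
            throw new DataTruncation(parameterIndex, true, false, codePoints, declaredChars);
        }
    }
}
```

The cheap length() comparison comes first so that codepoints are only counted for values that might be too long. A setBytes call would not pass through a check like this, which is the difference in behaviour mentioned above.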