MySqlConnector
MySqlConnector copied to clipboard
System.FormatException: Couldn't interpret '1899-00-00 00:00:00' as a valid DateTime during MySqlConnection.GetSchemaAsync("TABLES", ...)
Software versions MySqlConnector version: 2.1.10 (nuget latest stable) Server type (MySQL, MariaDB, Aurora, etc.) and version: 8.0.19 .NET version: .NET 6.0 (6.0.300) (Optional) ORM NuGet packages and versions: N/A
Describe the bug When issuing .GetSchemaAsync("TABLES", ...) on a MySqlConnection instance, I am getting the aforementioned exception. I am using GetSchemaAsync("TABLES", ...) to retrieve a list of tables as part of a bulk-import ETL process. Somehow this particular database has some invalid DateTime representations in the schema itself? I have ConvertZeroDateTime set to true. AllowZeroDateTime was undefined/default but I also tried setting that to true which then yielded a different error relating to it being converted from MySqlDateTime to DateTime during the transformation to a schema DataTable.
Additional edge case checking may be needed for garbage DateTimes.
Exception
System.FormatException: Couldn't interpret '1899-00-00 00:00:00' as a valid DateTime
---> System.ArgumentOutOfRangeException: Year, Month, and Day parameters describe an un-representable DateTime.
at System.DateTime..ctor(Int32 year, Int32 month, Int32 day, Int32 hour, Int32 minute, Int32 second, Int32 millisecond, DateTimeKind kind)
at MySqlConnector.Core.Row.ParseDateTime(ReadOnlySpan`1 value) in /_/src/MySqlConnector/Core/Row.cs:line 485
--- End of inner exception stack trace ---
at MySqlConnector.Core.Row.ParseDateTime(ReadOnlySpan`1 value) in /_/src/MySqlConnector/Core/Row.cs:line 556
at MySqlConnector.Core.TextRow.GetValueCore(ReadOnlySpan`1 data, ColumnDefinitionPayload columnDefinition) in /_/src/MySqlConnector/Core/TextRow.cs:line 90
at MySqlConnector.Core.Row.GetValue(Int32 ordinal) in /_/src/MySqlConnector/Core/Row.cs:line 38
at MySqlConnector.Core.Row.GetValues(Object[] values) in /_/src/MySqlConnector/Core/Row.cs:line 457
at MySqlConnector.Core.SchemaProvider.FillDataTableAsync(IOBehavior ioBehavior, DataTable dataTable, String tableName, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/SchemaProvider.cs:line 996
at MySqlConnector.Core.SchemaProvider.FillTables(IOBehavior ioBehavior, DataTable dataTable, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/SchemaProvider.cs:line 864
at MySqlConnector.Core.SchemaProvider.GetSchemaAsync(IOBehavior ioBehavior, String collectionName, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/SchemaProvider.cs:line 54
at MySQLImporter.Console.ImportService.RunAsync(CancellationToken stoppingToken) in C:\Users\Travis.Gutjahr\source\repos\OMITTED\Implementations\CustomTools\MySQLImporter.Console\MySQLImporter.Console\ImportService.cs:line 68
at MySQLImporter.Console.HostedServiceBase.ExecuteAsync(CancellationToken stoppingToken) in C:\Users\Travis.Gutjahr\source\repos\OMITTED\Implementations\CustomTools\MySQLImporter.Console\MySQLImporter.Console\HostedServiceBase.cs:line 33
Code sample
IList<string?> tableNames;
using (MySqlConnection srcSqlConn = new MySqlConnection(sourceDatabaseOptions.ConnectionString))
{
logger.LogDebug("Connecting to source database...");
await srcSqlConn.OpenAsync(stoppingToken);
logger.LogDebug("Retrieving a list of tables from the source database...");
using (DataTable schemaTable = await srcSqlConn.GetSchemaAsync("TABLES", stoppingToken))
tableNames = schemaTable.Rows.Cast<DataRow>().Where(a => string.Equals(sourceDatabaseOptions.Database, a["TABLE_SCHEMA"]?.ToString(), StringComparison.OrdinalIgnoreCase)).Select(a => a["TABLE_NAME"]?.ToString()).ToList();
}
Expected behavior .GetSchemaAsync("TABLES", ...) to complete without exception and proceed to the next line where schemaTable object is enumerated.
Additional context Not sure how you want to resolve this as this appears to be something specific to this data set and I am not sure I can reproduce this in a new database as it related to schema information. If you have any suggestions as to how you'd want this fixed I can either submit a PR or I can test your proposed fixed by grabbing a branch. Other databases work fine with my code but this database in particular is having a hard time due to this weird DateTime in the schema.
One thing to bear in mind is that since GetSchema and GetSchemaAsync return a DataTable, these DateTimes WILL be converted to System.DateTime and therefore whatever fix is implemented, it'll have to adhere to the range of System.DateTime and not MySqlDateTime.
That forked commit is just what I did to workaround it. I didn't submit it as a pull request since I failed to read the contributing guide prior to making that change. I am also not certain that this is the best way to fix this issue.
Perhaps setting AllowZeroDateTime to true so it returns from there as a MySqlDateTime and then adding logic to the IConvertable logic so that when it inevitably translates to DateTime for the GetSchema DataTable it can handle out of range values? Just kind of thinking out loud.
Not sure how you want to resolve this as this appears to be something specific to this data set and I am not sure I can reproduce this in a new database as it related to schema information.
Now that you've got GetSchemaAsync("TABLES")
working, can you look at the output to see where DateTime.MinValue
(i.e., 1/1/0001 12:00:00 AM) is showing up? That might provide some insight on how to reproduce this bug.
I'd like to have a reproducible test case first, to (a) understand why it's happening and that the fix is appropriate, and (b) add an integration/regression test for it.
if AllowZeroDateTime is false
- SchemaProvider class is initialized and GetSchemaAsync method is called on it.
- fillAction (FillTables method) looked up from m_schemaCollections ({ "Tables", FillTables },) and called.
- DataTable columns are added here with typeof(DateTime) which is expected.
- FillDataTableAsync is called which issues the query containing the offensive column data
- "SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE, ENGINE, VERSION, ROW_FORMAT, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH, MAX_DATA_LENGTH, INDEX_LENGTH, DATA_FREE, AUTO_INCREMENT, CREATE_TIME, UPDATE_TIME, CHECK_TIME, TABLE_COLLATION, CHECKSUM, CREATE_OPTIONS, TABLE_COMMENT FROM INFORMATION_SCHEMA.TABLES;"
- MySqlDataReader's GetValues is called.
- GetValue is called for each column.
- GetValueCore is called.
- Switch statement case ColumnType.DateTime: is hit and ParseDateTime method is called.
- The catch statement in ParseDateTime is hit at line 550 (master branch) after executing the return on line 547 due to the month and day both being 0, unrepresentable in System.DateTime.
else if AllowZeroDateTime is true
...same until after MySqlDataReader's GetValues call completes. Error occurs on "dataTable.Rows.Add(rowValues);" which performs an implicit IConvertable conversion of MySqlDateTime to System.DateTime and errors correctly due to IsValidDateTime being false. Line 201 of MySqlDateTime.cs is the line it hits and errors at with InvalidCastException. "DateTime IConvertible.ToDateTime(IFormatProvider? provider) => IsValidDateTime ? GetDateTime() : throw new InvalidCastException();"
ParseDateTime's ReadOnlySpan
new ReadOnlySpan<byte>[19] { 49, 56, 57, 57, 45, 48, 48, 45, 48, 48, 32, 48, 48, 58, 48, 48, 58, 48, 48 }
which translates to "1899-00-00 00:00:00" as UTF8. Valid MySqlDateTime but not value System.DateTime.
As an alternative to the fix I made, this could also be fixed if we set AllowZeroDateTime to true and then adjust line 201 of MySqlDateTime's IConvertible.ToDateTime to return DateTime.MinValue if IsValidDateTime is false instead of throwing InvalidCastException. That could introduce bugs in any consumer that expects that exception to be thrown however.
I do not like the alternative fix route mentioned above because for the regular querying of data on this connection, I do not want AllowZeroDateTime to be true and this error is happening as a result of GetSchema's expectation and requirement that the column be in DateTime.
Ultimately I believe the best fix is something akin to your line 497 of Row.cs (ParseDateTime method) which is... if (year == 0 && month == 0 && day == 0) which is basically what I had implemented but with the additional constraint that it only perform my logic if it is a schema information request. We could implement it for all ParseDateTime runs. if the year, month, or day is out of range for System.DateTime AND AllowZeroDateTime is false (meaning we don't allow out of range values) AND ConvertZeroDateTime is true, we return DateTime.MinValue. The ONLY reason I didn't just do that is the potential to break user-code if someone expects an error to be thrown as opposed to DateTime.MinValue being returned but line 500 already returns DateTime.MinValue if year, month, and day are all 0 so...
The best fix may just be...
// If we're converting zero or invalid DateTimes (ConvertZeroDateTime) but not allowing MySqlDateTime objects (AllowZeroDateTime), then we need to make sure we're within range. If we're not, return DateTime.MinValue. See https://github.com/mysql-net/MySqlConnector/issues/1180
if (Connection.ConvertZeroDateTime && !Connection.AllowZeroDateTime && (year < DateTime.MinValue.Year || year > DateTime.MaxValue.Year || month < DateTime.MinValue.Month || month > DateTime.MaxValue.Month || day < DateTime.MinValue.Day || day > DateTime.MaxValue.Day))
return DateTime.MinValue;
Side note: most databases on this MySQL server have a valid UPDATE_TIME, either NULL or an actual value ('2020-01-21 05:45:28') that isn't a date before computers as we know them exists. One specific database has the offending information in INFORMATION_SCHEMA's UPDATE_TIME but since the dates are consistent and well-formed, I don't suspect corruption, just... MySQL weirdness.
One specific database has the offending information in INFORMATION_SCHEMA's UPDATE_TIME but since the dates are consistent and well-formed, I don't suspect corruption, just... MySQL weirdness.
Thanks; this was the information I was looking for.
Curiously, googling for mysql "1899-00-00 00:00:00"
turns up just this issue, no it's not a widespread problem in MySQL Server as far as I can tell.
Agreed, I can't say I've ever encountered this until now.
This run failed with the following exception:
[xUnit.net 00:00:04.32] SideBySide.SchemaProviderTests.GetSchema(schemaName: "Partitions") [FAIL]
Skipped SideBySide.CommandTimeoutTests.CommandTimeoutResetsOnReadSync [1 ms]
Failed SideBySide.SchemaProviderTests.GetSchema(schemaName: "Partitions") [1 s]
Error Message:
System.FormatException : Couldn't interpret '1899-00-�5 �5:�5:�5' as a valid DateTime
Stack Trace:
at MySqlConnector.Core.Row.ParseDateTime(ReadOnlySpan`1 value) in /_/src/MySqlConnector/Core/Row.cs:line 485
at MySqlConnector.Core.TextRow.GetValueCore(ReadOnlySpan`1 data, ColumnDefinitionPayload columnDefinition) in /_/src/MySqlConnector/Core/TextRow.cs:line 90
at MySqlConnector.Core.Row.GetValue(Int32 ordinal) in /_/src/MySqlConnector/Core/Row.cs:line 38
at MySqlConnector.Core.Row.GetValues(Object[] values) in /_/src/MySqlConnector/Core/Row.cs:line 457
at MySqlConnector.Core.SchemaProvider.FillDataTableAsync(IOBehavior ioBehavior, DataTable dataTable, String tableName, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/SchemaProvider.cs:line 997
at MySqlConnector.Core.SchemaProvider.FillPartitions(IOBehavior ioBehavior, DataTable dataTable, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/SchemaProvider.cs:line 403
at MySqlConnector.Core.SchemaProvider.GetSchemaAsync(IOBehavior ioBehavior, String collectionName, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/SchemaProvider.cs:line 55
at MySqlConnector.MySqlConnection.GetSchema(String collectionName) in /_/src/MySqlConnector/MySqlConnection.cs:line 596
at SideBySide.SchemaProviderTests.GetSchema(String schemaName) in D:\a\1\s\tests\SideBySide\SchemaProviderTests.cs:line 105
Did that error occur on my fork branch or on master with a data set you were able to find?
No, it seemed like a spurious build failure unrelated to the code that was pushed. So possibly some kind of bug in MySQL Server?
Another run failed (using MySQL 8.0.15 on Windows as the server):
[xUnit.net 00:00:04.79] SideBySide.SchemaProviderTests.GetSchema(schemaName: "Tables") [FAIL]
Failed SideBySide.SchemaProviderTests.GetSchema(schemaName: "Tables") [1 s]
Error Message:
System.FormatException : Couldn't interpret '1899-00-�5 �5:�5:�5' as a valid DateTime
Stack Trace:
at MySqlConnector.Core.Row.ParseDateTime(ReadOnlySpan`1 value) in /_/src/MySqlConnector/Core/Row.cs:line 556
at MySqlConnector.Core.TextRow.GetValueCore(ReadOnlySpan`1 data, ColumnDefinitionPayload columnDefinition) in /_/src/MySqlConnector/Core/TextRow.cs:line 90
at MySqlConnector.Core.Row.GetValue(Int32 ordinal) in /_/src/MySqlConnector/Core/Row.cs:line 38
at MySqlConnector.Core.Row.GetValues(Object[] values) in /_/src/MySqlConnector/Core/Row.cs:line 457
at MySqlConnector.Core.SchemaProvider.FillDataTableAsync(IOBehavior ioBehavior, DataTable dataTable, String tableName, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/SchemaProvider.cs:line 996
at MySqlConnector.Core.SchemaProvider.FillTables(IOBehavior ioBehavior, DataTable dataTable, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/SchemaProvider.cs:line 864
at MySqlConnector.Core.SchemaProvider.GetSchemaAsync(IOBehavior ioBehavior, String collectionName, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/SchemaProvider.cs:line 54
at MySqlConnector.MySqlConnection.GetSchema(String collectionName) in /_/src/MySqlConnector/MySqlConnection.cs:line 596
at SideBySide.SchemaProviderTests.GetSchema(String schemaName) in D:\a\1\s\tests\SideBySide\SchemaProviderTests.cs:line 107
Searching for "1899-00-É5 É5:É5:É5" finds a couple of hits in Google. É
is 0xC9
, which could possibly be 201 or -55 as a numeric value.
I have a theory.
In 8.0.15, the code to convert a date to a string is here: https://github.com/mysql/mysql-server/blob/mysql-8.0.15/mysys/my_time.cc#L1096-L1176
Note the comment:
This functions don't check that given MYSQL_TIME structure members are in valid range. If they are not, return value won't reflect any valid date either.
Specifically, if any value is -1
, then the following C++ code will print C9 35
, i.e., É5
:
temp = ltime->hour;
temp2 = temp / 10;
temp = temp - temp2 * 10;
*to++ = (char)('0' + (char)(temp2));
*to++ = (char)('0' + (char)(temp));
So why do you get 1890-00-00 00:00:00
? Because that code has changed in a newer version to clamp the output to 0
-100
:
- https://github.com/mysql/mysql-server/blob/mysql-8.0.30/mysys/my_time.cc#L1333-L1353 calls
format_two_digits
-
format_two_digits
performs a range check and sets out-of-range values to0
: https://github.com/mysql/mysql-server/blob/mysql-8.0.30/mysys/my_time.cc#L1231-L1242
It's my belief that somewhere in MySQL, there is still a bug that causes MYSQL_TIME
fields to be set to -1
. In older versions of MySQL, that prints 1899-00-É5 É5:É5:É5
. In newer versions, that prints 1899-00-00 00:00:00
. Either way, it's a server-side bug that's sending invalid data.
MySqlConnector could work around that bug and coerce these two values to a valid DateTime
, but unless there is a constant value that these bogus strings are meant to represent, I'm concerned that could result in silent data loss. Better to throw an exception and have the client retry to get the right data.
That sounds fine for general use. However during my use case the issue was occurring during .GetSchemaAsync("TABLES", ...) on the MySqlConnection. No amount of retrying will help. This is an ETL flow, we won't have the ability to upgrade or alter their configuration in any way, including getting them to update MySQL Server.
Perhaps a compromise of an option to transform bad data for GetSchema calls only? The schema DataTable returned by GetSchema calls in all ADO.NET implementations always seems to be DateTime so replacing them with MySqlDate class wouldn't work. I think it has to be fudged a bit here to be both compatible and also work.
Just double-checking: you don't have any way to create a repro for this problem; it's just something you observe with the schema for a few (?) tables in one particular database on one particular server?
I can repro it with this specific database but I can't share the database as it has customer's data. I could probably drop all other tables and truncate the table in question, leaving only the schema. I'll work on that today and see if the database still blows up. If it does, I'll supply that database.
Moved teams, no easily sharable sample dataset. Closing as not fixing.