airbyte icon indicating copy to clipboard operation
airbyte copied to clipboard

Source MySQL: java.sql.SQLException: YEAR

Open toandm opened this issue 2 years ago • 3 comments

Environment

  • Airbyte version: 0.40.0-alpha
  • OS Version / Instance: debian-11-bullseye-v20220719, GCP n1-standard-4
  • Deployment: Docker env
  • Source Connector and version: airbyte/source-mysql:0.6.12
  • Destination Connector and version: airbyte/destination-bigquery:1.1.14
  • Step where error happened: Sync job

Current Behavior

I use MySQL connector to connect to customer's MariaDB - version 10.6.8-MariaDB. Hosted on AWS RDS Syncs happen normally. One table has the column of type date. MariaDB supports invalid values for date column. This column has some invalid date values like 0000-00-00 , 0000-12-19. While invalid value with all zero ('0000-00-00') works, invalid value with all zero in year but other numbers in month, day (i.e. '0000-12-19') fails. Details are in the logs attached below. This date column is recognized as "String" in Airbyte UI

Expected Behavior

Allow all value, both valid and invalid to be synced successfully

Actions attempted to resolve issue

Tried adding these params to JDBC string but still same error:

  • zeroDateTimeBehavior=CONVERT_TO_NULL
  • zeroDateTimeBehavior=ROUND
  • zeroDateTimeBehavior=convertToNull
  • noDatetimeStringSync=true

Logs

logs-232.txt

Steps to Reproduce

  1. In MariaDB create a table with date column, and insert invalid value of 0000-12-19
  2. On Airbyte, create a new MySQL connector connecting with above DB
  3. Select the table above and start sync

Are you willing to submit a PR?

I don't have Java knowledge, unfortunately, so I cannot

toandm avatar Sep 11 '22 23:09 toandm

@toandm it seems like you made an issue similar to this one a few weeks ago: #16159. Is this the same error?

Did you recently upgrade the MySQL connector and have you tried using an older version?

sajarin avatar Sep 13 '22 15:09 sajarin

Hi @sajarin , yes that's the same issue. I closed it because I thought the issue was not because of invalid date. However, upon further testing, I realize that the root cause is indeed because of invalid date, though only some forms of invalid date cause this error, as I mentioned in my post above.

toandm avatar Sep 14 '22 00:09 toandm

I just re-ran the job with the latest MySQL connector version 0.6.12, but same error.

toandm avatar Sep 14 '22 00:09 toandm

This has been reported before for MySQL: https://bugs.mysql.com/bug.php?id=94872. Looks like MySQL does support invalid dates whereas MariaDB does.

Have you looked into the ALLOW_INVALID_DATES mode for MySQL?

https://dev.mysql.com/doc/refman/8.0/en/sql-mode.html#sqlmode_allow_invalid_dates

sajarin avatar Sep 14 '22 16:09 sajarin

Hi @sajarin , thank you for the reply.

Looks like MySQL does support invalid dates whereas MariaDB does.

I tested it out with a MySQL instance on AWS. It does allow inserting invalid dates with zero for year.

image

Have you looked into the ALLOW_INVALID_DATES mode for MySQL?

Yes, I have read about this. This one is the server side mode for the MySQL/MariaDB database instance. I'm sure that this mode is set on my client's MariaDB database, as invalid dates are allowed to be inserted into the table.

toandm avatar Sep 15 '22 01:09 toandm

I found this when looking up information about invalid dates on the MySQL reference page: https://dev.mysql.com/doc/refman/5.6/en/using-date.html#:~:text=MySQL%20permits%20you%20to%20store%20a%20%E2%80%9Czero%E2%80%9D%20value%20of%20',0000%2D00%2D00'%20.

It seems like 0000-00-00 is a valid date but 0000-12-30 is not. 0000-00-00 is reserved for "dummy" data. This may just be a limitation with the connector although MySQL does support invalid dates if you enable specific params (like the ones you mentioned in your post) so I'm not exactly sure why it's not working.

sajarin avatar Sep 15 '22 16:09 sajarin

Yeah, I just think that since MySQL supports both of these invalid forms, it makes sense for the connector to respect it and still output the values as normal. My guess is that the JDBC connector is hard-coded to accept all zero date, but does not handle the case where year is '0000'. Please somehow handle this, some important tables of my client have these invalid values and they are causing a lot of headaches.

toandm avatar Sep 16 '22 00:09 toandm

@toandm see discussion in https://github.com/airbytehq/airbyte/issues/8722 today this is a limitation of Airbyte connector. Hope the issue will be fixed soon.

marcosmarxm avatar Sep 23 '22 16:09 marcosmarxm

Related to this one. https://github.com/airbytehq/airbyte/issues/18609 If it's unreadable, set to NULL. Add tests remember CDC and non-CDC.

bleonard avatar Jan 11 '23 18:01 bleonard

Closing this as a duplicate of #18609.

rodireich avatar Apr 12 '23 05:04 rodireich