airbyte
Support MariaDB Source
Tell us about the new integration you’d like to have
I would like to sync data from my MariaDB instance (source).
I spent an hour going back and forth with this without realising mariadb wasn't supported until now lol
This ought to be a priority. Airbyte already supports MySQL, so getting it working with MariaDB should be a stone's throw away?
It's certainly more popular than mysql, and in many cases preferred, at least according to my network.
A user tried to use the MySQL source connector with MariaDB. It works for some tables but throws errors for others.
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - Caused by: java.sql.SQLDataException: Value '3428724653' is outside of valid range for type java.lang.Integer
It looks like the integer value coming from MariaDB is higher than the maximum that Java and PostgreSQL accept. The Java and PostgreSQL max integer is 2147483647 (not completely sure about this statement), while the MariaDB/Matomo integer value is 3428724653.
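A quick sanity check of the ranges involved, as a rough sketch rather than anything taken from the Airbyte code. It assumes the failing column is a 32-bit unsigned integer, which the log alone does not confirm; the helper names are made up for the illustration.

```python
# Illustrative only: these helpers are hypothetical and not part of Airbyte.
INT32_MAX = 2**31 - 1    # 2147483647, max of java.lang.Integer and PostgreSQL integer
UINT32_MAX = 2**32 - 1   # 4294967295, max of a 32-bit unsigned integer
INT64_MAX = 2**63 - 1    # max of java.lang.Long and PostgreSQL bigint


def fits_signed_32(value: int) -> bool:
    """True if the value fits in a signed 32-bit integer."""
    return -2**31 <= value <= INT32_MAX


def fits_unsigned_32(value: int) -> bool:
    """True if the value fits in an unsigned 32-bit integer."""
    return 0 <= value <= UINT32_MAX


def fits_signed_64(value: int) -> bool:
    """True if the value fits in a signed 64-bit integer."""
    return -2**63 <= value <= INT64_MAX


failing_value = 3428724653  # the value reported in the error log
print(fits_signed_32(failing_value))
print(fits_unsigned_32(failing_value))
print(fits_signed_64(failing_value))
```

If that assumption holds, the value is legal in the source column but too large for a signed 32-bit read, and a 64-bit read would hold it.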
Log from the user trying to use the MySQL source connector with MariaDB:
2021-06-01 18:12:35 INFO (/tmp/workspace/20/1) WorkerRun(call):62 - Executing worker wrapper. Airbyte version: 0.24.4-alpha
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) TemporalAttemptExecution(get):111 - Executing worker wrapper. Airbyte version: 0.24.4-alpha
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) DefaultReplicationWorker(run):97 - start sync worker. job id: 20 attempt id: 1
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) DefaultReplicationWorker(run):106 - configured sync modes: {matomo.matomo_log_link_visit_action=full_refresh - append, matomo.matomo_log_profiling=full_refresh - append, matomo.matomo_log_conversion=full_refresh - append, matomo.matomo_log_visit=full_refresh - append, matomo.matomo_log_conversion_item=full_refresh - append, matomo.matomo_log_action=full_refresh - append}
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) DefaultAirbyteDestination(start):81 - Running destination...
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) LineGobbler(voidCall):69 - Checking if airbyte/destination-postgres:0.3.5 exists...
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) LineGobbler(voidCall):69 - airbyte/destination-postgres:0.3.5 was found locally.
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) DockerProcessFactory(create):111 - Preparing command: docker run --rm --init -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/20/1 --network host airbyte/destination-postgres:0.3.5 write --config destination_config.json --catalog destination_catalog.json
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) LineGobbler(voidCall):69 - Checking if airbyte/source-mysql:0.3.1 exists...
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) LineGobbler(voidCall):69 - airbyte/source-mysql:0.3.1 was found locally.
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) DockerProcessFactory(create):111 - Preparing command: docker run --rm --init -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/20/1 --network host airbyte/source-mysql:0.3.1 read --config source_config.json --catalog source_catalog.json --state input_state.json
2021-06-01 18:12:36 INFO (/tmp/workspace/20/1) DefaultReplicationWorker(run):132 - Waiting for source thread to join.
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - Exception in thread "main" java.lang.RuntimeException: java.sql.SQLDataException: Value '3428724653' is outside of valid range for type java.lang.Integer
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at io.airbyte.db.jdbc.JdbcUtils$1.tryAdvance(JdbcUtils.java:79)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at java.base/java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at io.airbyte.commons.util.DefaultAutoCloseableIterator.computeNext(DefaultAutoCloseableIterator.java:58)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at io.airbyte.commons.util.LazyAutoCloseableIterator.computeNext(LazyAutoCloseableIterator.java:62)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:42)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at io.airbyte.commons.util.DefaultAutoCloseableIterator.computeNext(DefaultAutoCloseableIterator.java:58)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:42)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at io.airbyte.commons.util.DefaultAutoCloseableIterator.computeNext(DefaultAutoCloseableIterator.java:58)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at io.airbyte.commons.util.CompositeIterator.computeNext(CompositeIterator.java:83)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at io.airbyte.commons.util.DefaultAutoCloseableIterator.computeNext(DefaultAutoCloseableIterator.java:58)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at java.base/java.util.Iterator.forEachRemaining(Iterator.java:132)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at io.airbyte.integrations.base.IntegrationRunner.run(IntegrationRunner.java:105)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at io.airbyte.integrations.source.mysql.MySqlSource.main(MySqlSource.java:308)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - Caused by: java.sql.SQLDataException: Value '3428724653' is outside of valid range for type java.lang.Integer
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:114)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:73)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:92)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.mysql.cj.jdbc.result.ResultSetImpl.getObject(ResultSetImpl.java:1393)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at com.mysql.cj.jdbc.result.ResultSetImpl.getInt(ResultSetImpl.java:797)
2021-06-01 18:12:37 ERROR (/tmp/workspace/20/1) LineGobbler(voidCall):69 - at org.apache.commons.dbcp2.DelegatingResultSet.getInt(DelegatingResultSet.java:623)
- Investigate whether there are any limitations specific to MariaDB that prevent us from using the MySQL Source connector.
- Document the limitations if any are found
- Decide if we need a separate connector for MariaDB
Looks like fixing the current MySQL source is much simpler than creating a dedicated connector, as we did to make the MSSQL connector work with SQL Azure.
@kimerinn what bugs and limitations have you found in the current MySQL connector that apply to MariaDB?
There are hundreds of differences between the latest MySQL and MariaDB: https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/
But at the protocol level they are compatible. We could use the same Connector/J JDBC driver to connect to both MySQL and MariaDB. General data types are also identical (the error from the log shows that an unsigned integer is treated as signed; I do not currently know why, and more detailed information is needed): https://dev.mysql.com/doc/refman/8.0/en/data-types.html https://mariadb.com/kb/en/data-types/
Common SQL is also the same. So, as long as we do not use SQL dialect features that differ between MySQL and MariaDB, we should be safe.
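To make the "unsigned integer treated as signed" symptom concrete, here is a minimal sketch of choosing a target type from the declared column type instead of reading every INT as a signed 32-bit value. This is not how the connector is implemented; `target_type` and the target names `int32`, `int64` and `decimal` are hypothetical, and the input is assumed to look like a MySQL/MariaDB column type string such as `int(10) unsigned`.

```python
import re

# Storage width in bits of the MySQL/MariaDB integer types.
INTEGER_BITS = {
    "tinyint": 8,
    "smallint": 16,
    "mediumint": 24,
    "int": 32,
    "integer": 32,
    "bigint": 64,
}


def target_type(column_type: str) -> str:
    """Return a hypothetical target type wide enough for the column's full range."""
    lowered = column_type.lower()
    match = re.match(r"\s*([a-z]+)", lowered)
    base = match.group(1) if match else ""
    if base not in INTEGER_BITS:
        raise ValueError(f"not an integer column type: {column_type!r}")
    unsigned = "unsigned" in lowered.split()
    # An unsigned column needs one more bit than a signed column of the same width.
    needed_bits = INTEGER_BITS[base] + (1 if unsigned else 0)
    if needed_bits <= 32:
        return "int32"
    if needed_bits <= 64:
        return "int64"
    return "decimal"  # e.g. BIGINT UNSIGNED exceeds a signed 64-bit integer


print(target_type("int(11)"))
print(target_type("int(10) unsigned"))
print(target_type("bigint(20) unsigned"))
```

Under this scheme an `INT UNSIGNED` column would be read as a 64-bit value, so 3428724653 would no longer overflow.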
@kimerinn
- Are there any changes that we need to make to the MySQL connector to make it work with MariaDB in JDBC mode?
- Are there any changes that we need to make to the MySQL connector to make it work with MariaDB in CDC mode?
- Theoretically, we should not need to make any changes. MySQL's Connector/J is compatible with MariaDB. "All MySQL connectors (PHP, Perl, Python, Java, .NET, MyODBC, Ruby, MySQL C connector etc) work unchanged with MariaDB." https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/
- No. Although the Debezium connector for MySQL has not been tested against MariaDB, multiple reports from the community indicate successful usage of the connector with this database. Official support for MariaDB is planned for a future Debezium version. https://debezium.io/documentation/reference/stable/connectors/mysql.html https://stackoverflow.com/questions/50961788/debezium-mysql-mariadb-connector-how-to-resume-from-a-previous-bin-log-file This means we should not need to make any changes for CDC.
@kimerinn now that we have the E2E testing tool (cc @DoNotPanicUA ), can you please create an end-to-end test that validates the assumption that the Airbyte MySQL Source Connector works with MariaDB?
@grishick the E2E testing tool is in its first stage and requires some important updates before we can use it as a regular tool. We need at least the third stage from our roadmap, or even the fourth.
I understand that. I just ran the E2E tool locally to sync a simple table from MariaDB on AWS to Postgres on AWS, and I see that it does not have all the features needed to create scenarios that test the full functionality of the MySQL Connector. Can the two of you work together to add the missing features to the tool and to create the test scenarios? For the scope of this issue, I do not need the tests to run regularly on CI. I am looking for the following:
- write up test scenarios in text (Google Doc is sufficient)
- create corresponding test scenario files for the E2E testing tool
- add missing features to the tool so that it can run the scenarios
- run the tests locally with MySQL and MariaDB sources and document any differences in results or in setup steps (I don't expect there will be any differences)
- create a repository for test scenarios and commit the scenarios to that repository, so that anyone can repeat the tests locally
I think this should be possible once the 3rd stage of the roadmap is done and having this as a trial test for the tool will help make sure that the tool is useful.
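For the "document any differences in results" step, one simple approach is to compare the rows synced from each source. The sketch below assumes each sync result has been exported as a list of dicts keyed by column name and that the table has a primary-key column; that format, the function name `diff_records` and the sample rows are all assumptions made for illustration, not the E2E testing tool's actual output.

```python
def diff_records(left, right, key):
    """Compare two lists of row dicts by primary key and report the differences."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    shared = left_by_key.keys() & right_by_key.keys()
    return {
        # Keys present in only one of the two result sets.
        "only_left": sorted(left_by_key.keys() - right_by_key.keys()),
        "only_right": sorted(right_by_key.keys() - left_by_key.keys()),
        # Keys present in both whose rows differ in at least one column.
        "changed": sorted(k for k in shared if left_by_key[k] != right_by_key[k]),
    }


# Made-up sample rows standing in for the MySQL and MariaDB sync results.
mysql_rows = [{"id": 1, "visits": 10}, {"id": 2, "visits": 3428724653}]
mariadb_rows = [{"id": 1, "visits": 10}, {"id": 2, "visits": 3428724653}]

print(diff_records(mysql_rows, mariadb_rows, key="id"))
```

Three empty lists would mean both sources produced the same rows for that table.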
@grishick Actually, we need to populate only credential files for the local run. It makes sense to focus on getting credential files from our secret storage to simplify E2E testing tool usage at the current stage.
@kimerinn Only a few things are missing to run the test locally:
- Bring up a MySQL source on our AWS instance
- Load some test data into MySQL
- Bring up MariaDB on our AWS instance
- Prepare credential files for the E2E testing tool
@grishick @DoNotPanicUA I created a corresponding task, "E2E Testing Tool: test MySQL source on MariaDB": https://github.com/airbytehq/airbyte/issues/15836