
Data truncation: Data too long for column METADATA_KEY

[Open] microbioticajon opened this issue 3 years ago · 0 comments

Hi All,

I'm running into an issue with my deployment of Cromwell 65. I am running scripts connecting to a local MySQL (also tested on MariaDB). Upon running a reasonably complex pipeline I receive a number of database errors:

java.sql.BatchUpdateException: Data truncation: Data too long for column 'METADATA_KEY' at row 6
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:78)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
	at com.mysql.cj.util.Util.handleNewInstance(Util.java:192)
	at com.mysql.cj.util.Util.getInstance(Util.java:167)
	at com.mysql.cj.util.Util.getInstance(Util.java:174)
	at com.mysql.cj.jdbc.exceptions.SQLError.createBatchUpdateException(SQLError.java:224)
	at com.mysql.cj.jdbc.ClientPreparedStatement.executeBatchedInserts(ClientPreparedStatement.java:755)
	at com.mysql.cj.jdbc.ClientPreparedStatement.executeBatchInternal(ClientPreparedStatement.java:426)
	at com.mysql.cj.jdbc.StatementImpl.executeBatch(StatementImpl.java:796)
	at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:128)
	at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java)
	at slick.jdbc.JdbcActionComponent$InsertActionComposerImpl$MultiInsertAction.$anonfun$run$18(JdbcActionComponent.scala:542)
	at slick.jdbc.JdbcBackend$SessionDef.withPreparedStatement(JdbcBackend.scala:425)
	at slick.jdbc.JdbcBackend$SessionDef.withPreparedStatement$(JdbcBackend.scala:420)
	at slick.jdbc.JdbcBackend$BaseSession.withPreparedStatement(JdbcBackend.scala:489)
	at slick.jdbc.JdbcActionComponent$InsertActionComposerImpl.preparedInsert(JdbcActionComponent.scala:511)
	at slick.jdbc.JdbcActionComponent$InsertActionComposerImpl$MultiInsertAction.run(JdbcActionComponent.scala:536)
	at slick.jdbc.JdbcActionComponent$SimpleJdbcProfileAction.run(JdbcActionComponent.scala:28)
	at slick.jdbc.JdbcActionComponent$SimpleJdbcProfileAction.run(JdbcActionComponent.scala:25)
	at slick.dbio.DBIOAction$$anon$1.$anonfun$run$1(DBIOAction.scala:186)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at slick.dbio.DBIOAction$$anon$1.run(DBIOAction.scala:186)
	at slick.dbio.DBIOAction$$anon$1.run(DBIOAction.scala:183)
	at slick.dbio.SynchronousDatabaseAction$$anon$7.run(DBIOAction.scala:486)
	at slick.basic.BasicBackend$DatabaseDef$$anon$3.liftedTree1$1(BasicBackend.scala:276)
	at slick.basic.BasicBackend$DatabaseDef$$anon$3.run(BasicBackend.scala:276)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
	at java.base/java.lang.Thread.run(Thread.java:831)
Caused by: com.mysql.cj.jdbc.exceptions.MysqlDataTruncation: Data truncation: Data too long for column 'METADATA_KEY' at row 6
	at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:104)
	at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
	at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1092)
	at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1040)
	at com.mysql.cj.jdbc.ClientPreparedStatement.executeLargeUpdate(ClientPreparedStatement.java:1350)
	at com.mysql.cj.jdbc.ClientPreparedStatement.executeBatchedInserts(ClientPreparedStatement.java:716)
	... 27 common frames omitted

Initially nothing was persisting to the database, which made debugging tricky, so I updated the database to increase the size of the METADATA_ENTRY.METADATA_KEY column from varchar(255) to something silly like varchar(3000) and re-ran the offending script.
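For reference, the widening I applied was along these lines (a sketch only; varchar(3000) is an arbitrary debugging value, and Cromwell's own schema migrations are not aware of this change):

```sql
-- Temporary workaround (MySQL syntax): widen the key column so long
-- nested-workflow keys persist instead of failing the batch insert.
-- Adjust nullability and character set to match the existing column
-- definition in your deployment.
ALTER TABLE cromwell.METADATA_ENTRY
  MODIFY COLUMN METADATA_KEY varchar(3000);
```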

The culprit showed itself as:

SELECT length(METADATA_KEY) x, METADATA_KEY
FROM cromwell.METADATA_ENTRY
WHERE length(METADATA_KEY) = (SELECT max(length(METADATA_KEY)) FROM METADATA_ENTRY);

323, "inputs:batch_files:/mnt/data/cromwell-executions/build_bob_ep/40573452-6a92-4e26-8f0d-02bd980970b7/call-build_bob/build_bob/b6e60c8e-2d0a-4db7-8b70-b71cde217b30/call-building_taxonomy/building_taxonomy/c2e537ff-e231-4b51-a423-f750604dca7c/call-classify_f33ef23grwsg32fgv/inputs/2065711490/GTDB_GB_GCA_123456789.1.mask.fasta"

Basically, as the complexity of the workflows increases, the potential length of the input keys increases and the varchar(255) limit is exceeded. Going forward this will not be our most complicated workflow, so I expect to hit this more frequently.

So firstly, am I doing anything wrong?

Secondly, would it be possible to increase the maximum size of the METADATA_KEY column so that it can accommodate increasing levels of workflow complexity? I can do this post-deployment using Ansible, but that feels a little messy.

(I have also posted this on your JIRA backlog, as I'm not sure which is the best place to raise this: https://broadworkbench.atlassian.net/browse/CROM-6721)

Best, Jon

microbioticajon · Oct 19 '21 16:10