TiBigData
[BUG] Error when reading TiDB using the TiKV connector in Hive: Cannot parse "0000-00-00 00:00:00"
Describe the bug
When we use the MapReduce-TiDB-Connector to read data from TiDB/TiKV and write it to Hive, we sometimes hit the exception below while processing timestamp data:
2024-03-27 19:09:11,532 WARN [main] io.tidb.bigdata.hive.TiDBRecordReader: Can not close session
2024-03-27 19:09:11,533 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: org.joda.time.IllegalFieldValueException: Cannot parse "0000-00-00 00:00:00": Value 0 for monthOfYear must be in the range [1,12]
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:355)
at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:194)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:476)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:352)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1772)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: org.joda.time.IllegalFieldValueException: Cannot parse "0000-00-00 00:00:00": Value 0 for monthOfYear must be in the range [1,12]
at org.joda.time.field.FieldUtils.verifyValueBounds(FieldUtils.java:234)
at org.joda.time.chrono.BasicMonthOfYearDateTimeField.set(BasicMonthOfYearDateTimeField.java:299)
at org.joda.time.format.DateTimeParserBucket$SavedField.set(DateTimeParserBucket.java:568)
at org.joda.time.format.DateTimeParserBucket.computeMillis(DateTimeParserBucket.java:447)
at org.joda.time.format.DateTimeParserBucket.computeMillis(DateTimeParserBucket.java:411)
at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:882)
at org.joda.time.DateTime.parse(DateTime.java:160)
at io.tidb.bigdata.tidb.types.Converter.strToDateTime(Converter.java:230)
at io.tidb.bigdata.tidb.types.TimestampType.getOriginDefaultValueNonNull(TimestampType.java:97)
at io.tidb.bigdata.tidb.types.TimestampType.getOriginDefaultValueNonNull(TimestampType.java:46)
at io.tidb.bigdata.tidb.types.DataType.getOriginDefaultValue(DataType.java:466)
at io.tidb.bigdata.tidb.meta.TiColumnInfo.getOriginDefaultValueAsByteString(TiColumnInfo.java:221)
at io.tidb.bigdata.tidb.meta.TiColumnInfo.toProto(TiColumnInfo.java:251)
at io.tidb.bigdata.tidb.meta.TiDAGRequest.buildScan(TiDAGRequest.java:390)
at io.tidb.bigdata.tidb.meta.TiDAGRequest.buildTableScan(TiDAGRequest.java:210)
at io.tidb.bigdata.tidb.operation.iterator.CoprocessorIterator.getRowIterator(CoprocessorIterator.java:91)
at io.tidb.bigdata.tidb.ClientSession.iterate(ClientSession.java:303)
at io.tidb.bigdata.tidb.RecordSetInternal.iterator(RecordSetInternal.java:117)
at io.tidb.bigdata.tidb.RecordSetInternal.cursor(RecordSetInternal.java:96)
at io.tidb.bigdata.hive.TiDBRecordReader.initCursor(TiDBRecordReader.java:92)
at io.tidb.bigdata.hive.TiDBRecordReader.next(TiDBRecordReader.java:104)
at io.tidb.bigdata.hive.TiDBRecordReader.next(TiDBRecordReader.java:45)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
... 13 more
I found a similar issue for reads through the MySQL JDBC driver: https://github.com/tidb-incubator/TiBigData/issues/26
I wrote a unit test to reproduce this issue:
public class TiColumnInfoTest {
  @Test
  public void testToProto() {
    TiColumnInfo columnInfo =
        new TiColumnInfo(
            1L,
            "name",
            0,
            TimestampType.TIMESTAMP,
            SchemaState.StatePublic,
            "0000-00-00 00:00:00",
            "0000-00-00 00:00:00",
            "0000-00-00 00:00:00",
            "timestamp",
            1,
            "",
            false);
    // Throws IllegalFieldValueException: the zero timestamp default cannot be parsed.
    System.out.println(columnInfo.getOriginDefaultValueAsByteString());
  }
}
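One possible direction, offered only as a sketch and not as the project's actual fix: check for the zero timestamp before handing the string to the date parser. The class `ZeroTimestampGuard` and its `parse` method below are hypothetical names, not part of TiBigData, and the example uses `java.time` instead of Joda-Time so that it stays self-contained. It also assumes that treating the zero value as "absent" is acceptable for the caller, which may not hold for every use case.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of TiBigData.
class ZeroTimestampGuard {
  // The MySQL/TiDB "zero" timestamp that a strict date parser rejects.
  private static final String ZERO_TIMESTAMP = "0000-00-00 00:00:00";

  private static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

  // Returns an empty Optional for null or the zero timestamp,
  // otherwise parses the value as a local date-time.
  // Assumption: the caller can treat the zero value as "no value".
  static Optional<LocalDateTime> parse(String value) {
    if (value == null || ZERO_TIMESTAMP.equals(value)) {
      return Optional.empty();
    }
    return Optional.of(LocalDateTime.parse(value, FORMAT));
  }

  public static void main(String[] args) {
    // The zero timestamp is skipped instead of raising a parse error.
    System.out.println(parse("0000-00-00 00:00:00").isPresent());
    // A regular timestamp is parsed as usual.
    System.out.println(parse("2024-03-27 19:09:11").get());
  }
}
```

Whether the connector should map the zero value to null, to a sentinel, or to an error is a design decision for the maintainers; the sketch only shows where such a guard could sit relative to the parse call.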