TiBigData

[BUG] Error when reading TiDB using the TiKV connector in Hive: Cannot parse "0000-00-00 00:00:00"

Open lfyzjck opened this issue 10 months ago • 0 comments

Describe the bug

We use MapReduce-TiDB-Connector to read data from TiDB/TiKV and write it to Hive. In some cases, we hit the following exception while processing timestamp data:

2024-03-27 19:09:11,532 WARN [main] io.tidb.bigdata.hive.TiDBRecordReader: Can not close session
2024-03-27 19:09:11,533 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: org.joda.time.IllegalFieldValueException: Cannot parse "0000-00-00 00:00:00": Value 0 for monthOfYear must be in the range [1,12]
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:355)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:194)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:476)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:352)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1772)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: org.joda.time.IllegalFieldValueException: Cannot parse "0000-00-00 00:00:00": Value 0 for monthOfYear must be in the range [1,12]
    at org.joda.time.field.FieldUtils.verifyValueBounds(FieldUtils.java:234)
    at org.joda.time.chrono.BasicMonthOfYearDateTimeField.set(BasicMonthOfYearDateTimeField.java:299)
    at org.joda.time.format.DateTimeParserBucket$SavedField.set(DateTimeParserBucket.java:568)
    at org.joda.time.format.DateTimeParserBucket.computeMillis(DateTimeParserBucket.java:447)
    at org.joda.time.format.DateTimeParserBucket.computeMillis(DateTimeParserBucket.java:411)
    at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:882)
    at org.joda.time.DateTime.parse(DateTime.java:160)
    at io.tidb.bigdata.tidb.types.Converter.strToDateTime(Converter.java:230)
    at io.tidb.bigdata.tidb.types.TimestampType.getOriginDefaultValueNonNull(TimestampType.java:97)
    at io.tidb.bigdata.tidb.types.TimestampType.getOriginDefaultValueNonNull(TimestampType.java:46)
    at io.tidb.bigdata.tidb.types.DataType.getOriginDefaultValue(DataType.java:466)
    at io.tidb.bigdata.tidb.meta.TiColumnInfo.getOriginDefaultValueAsByteString(TiColumnInfo.java:221)
    at io.tidb.bigdata.tidb.meta.TiColumnInfo.toProto(TiColumnInfo.java:251)
    at io.tidb.bigdata.tidb.meta.TiDAGRequest.buildScan(TiDAGRequest.java:390)
    at io.tidb.bigdata.tidb.meta.TiDAGRequest.buildTableScan(TiDAGRequest.java:210)
    at io.tidb.bigdata.tidb.operation.iterator.CoprocessorIterator.getRowIterator(CoprocessorIterator.java:91)
    at io.tidb.bigdata.tidb.ClientSession.iterate(ClientSession.java:303)
    at io.tidb.bigdata.tidb.RecordSetInternal.iterator(RecordSetInternal.java:117)
    at io.tidb.bigdata.tidb.RecordSetInternal.cursor(RecordSetInternal.java:96)
    at io.tidb.bigdata.hive.TiDBRecordReader.initCursor(TiDBRecordReader.java:92)
    at io.tidb.bigdata.hive.TiDBRecordReader.next(TiDBRecordReader.java:104)
    at io.tidb.bigdata.hive.TiDBRecordReader.next(TiDBRecordReader.java:45)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
    ... 13 more

I also found a similar issue involving reads through the MySQL JDBC driver: https://github.com/tidb-incubator/TiBigData/issues/26

I wrote a unit test that reproduces the issue:

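// TiColumnInfo and TimestampType packages are taken from the stack trace above.
// The SchemaState package and the JUnit 4 import are assumptions; adjust if they differ.
import io.tidb.bigdata.tidb.meta.SchemaState;
import io.tidb.bigdata.tidb.meta.TiColumnInfo;
import io.tidb.bigdata.tidb.types.TimestampType;
import org.junit.Test;

public class TiColumnInfoTest {

  @Test
  public void testToProto() {
    // A TIMESTAMP column whose default values are the zero date.
    TiColumnInfo columnInfo =
        new TiColumnInfo(
            1L,
            "name",
            0,
            TimestampType.TIMESTAMP,
            SchemaState.StatePublic,
            "0000-00-00 00:00:00",
            "0000-00-00 00:00:00",
            "0000-00-00 00:00:00",
            "timestamp",
            1,
            "",
            false);
    // This call throws the IllegalFieldValueException shown in the stack trace.
    System.out.println(columnInfo.getOriginDefaultValueAsByteString());
  }
}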
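
According to the stack trace, the failure does not come from row data. It is raised in Converter.strToDateTime while the column's origin default value is converted inside TiColumnInfo.toProto. One possible direction for a fix is to check for the zero date before handing the string to a strict parser. The following is only a minimal sketch of that idea, not the connector's code: the class ZeroDateGuard and the method parseOrEmpty are hypothetical names, and it uses java.time instead of Joda-Time so that it runs without extra dependencies. How a zero date should be represented downstream (null, epoch, or something else) is left open.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper, not part of TiBigData: illustrates guarding a strict
// date-time parser against the "0000-00-00 00:00:00" zero date.
public class ZeroDateGuard {

  private static final String ZERO_DATETIME = "0000-00-00 00:00:00";

  private static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

  // Returns Optional.empty() for null or the zero date; otherwise parses strictly.
  public static Optional<LocalDateTime> parseOrEmpty(String value) {
    if (value == null || ZERO_DATETIME.equals(value)) {
      return Optional.empty();
    }
    return Optional.of(LocalDateTime.parse(value, FORMAT));
  }

  public static void main(String[] args) {
    // Without the guard, a strict parser rejects month 0, much like Joda-Time does.
    try {
      LocalDateTime.parse(ZERO_DATETIME, FORMAT);
    } catch (DateTimeParseException e) {
      System.out.println("Unguarded parse failed: " + e.getMessage());
    }

    // With the guard, the zero date becomes an empty Optional instead of an exception.
    System.out.println("Guarded zero date present: "
        + parseOrEmpty(ZERO_DATETIME).isPresent());
    System.out.println("Guarded normal value: "
        + parseOrEmpty("2024-03-27 19:09:11"));
  }
}
```

In the connector, an equivalent check would presumably sit at or before the Converter.strToDateTime call, but that placement is an assumption based only on the trace above.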

lfyzjck, Mar 28 '24 03:03