tidb icon indicating copy to clipboard operation
tidb copied to clipboard

Lighting precheck shows incorrect estimate sorted data size

Open db-will opened this issue 1 year ago • 2 comments

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Use lightning to import large enough data (>5TB or more) with phsyical import and precheck enabled.

The issue seems related to converting int64 to float64 during print: https://github.com/pingcap/tidb/blob/master/lightning/pkg/importer/precheck_impl.go#L587-L588

theResult.Message = fmt.Sprintf("local disk resources are rich, estimate sorted data size %s, local available is %s",
	units.BytesSize(float64(estimatedDataSizeWithIndex)), units.BytesSize(float64(localAvailable)))

https://github.com/pingcap/tidb/blob/master/lightning/pkg/importer/precheck_impl.go#L179-L180

theResult.Message += fmt.Sprintf("TiKV requires more storage space. Estimated required size: %s. Actual size: %s.",
	units.BytesSize(float64(tikvSourceSize)), units.BytesSize(float64(tikvAvail)))

Consider the imported data size is small, it could also relates to tikvSourceSize calculation incorrect and hence trigger such issue.

2. What did you expect to see? (Required)

The precheck log should show correct number for estimated data source size.

3. What did you see instead (Required)

local disk resources are rich, estimate sorted data size -3.074e+18B, local available is xxx(~700GB)
TiKV requires more storage space. Estimated required size: 8EiB. Actual size: xxx(~5TB).  

4. What is your TiDB version? (Required)

v7.5.1

db-will avatar Jun 26 '24 02:06 db-will