databricks-sql-go

Feat/enable native decimal support

Open nikhilmehta16 opened this issue 7 months ago • 0 comments

Summary

Enables native Arrow decimal support by removing an unnecessary blocking condition in ScanRow and turning on native decimals by default. The decimal128Container infrastructure was already complete; these changes let decimal columns come back as native Arrow decimal128 values instead of string-encoded values that callers had to parse themselves.

Changes

  • Enable native decimal by default: Set UseArrowNativeDecimal = true in ArrowConfig.WithDefaults()
  • Remove ScanRow blocking: Remove the decimal type check from the ScanRow error condition
  • Update test expectations: Allow decimal columns to scan successfully
  • Interval types remain blocked (still unsupported)

Why Native Arrow Decimal128?

Compared with the previous approach, where decimal columns arrived as arrow.STRING, this provides:

  • Native Arrow types: Arrow records now contain proper decimal128 types with precision/scale metadata
  • Type safety: Schema correctly reflects arrow.DECIMAL128 instead of arrow.STRING
  • No manual parsing: Direct access to decimal values without parsing string data
  • Precision preservation: Full decimal precision maintained through Arrow's native decimal128 format

Impact

Before:

  • UseArrowNativeDecimal = false by default
  • Decimal columns → string-encoded values in Arrow records, requiring manual parsing
  • Schema shows arrow.STRING type for decimal columns
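
For context, the manual parsing that the old default pushed onto callers looks roughly like the sketch below. It assumes the string column holds plain decimal text such as "-123.45" (no exponent notation); parseDecimalString is a hypothetical helper written for illustration and is not part of the driver.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// parseDecimalString (hypothetical helper) splits plain decimal text such as
// "-123.45" into an unscaled integer (-12345) and a scale (2).
// Exponent notation like "1e-3" is not handled and yields an error.
func parseDecimalString(s string) (*big.Int, int32, error) {
	intPart, fracPart, _ := strings.Cut(s, ".")
	// Concatenating both parts gives the unscaled digits; SetString rejects
	// anything that is not an optionally signed run of base-10 digits.
	unscaled, ok := new(big.Int).SetString(intPart+fracPart, 10)
	if !ok {
		return nil, 0, fmt.Errorf("invalid decimal %q", s)
	}
	return unscaled, int32(len(fracPart)), nil
}

func main() {
	unscaled, scale, err := parseDecimalString("-123.45")
	if err != nil {
		panic(err)
	}
	fmt.Println(unscaled, scale)
}
```

With native decimal128 columns, the unscaled value and the scale come from the Arrow array and its type metadata, so this step is no longer needed.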

After:

  • UseArrowNativeDecimal = true by default
  • Decimal columns → native decimal128 values (Arrow type arrow.Decimal128Type) in Arrow records
  • Schema correctly shows arrow.DECIMAL128 with precision/scale metadata
  • Direct conversion to float64 available via decimal128Value.ToFloat64(scale)

For Direct Arrow Usage

Users working with Arrow records directly now get:

// Schema reflects true types
field.Type.ID() == arrow.DECIMAL128  // ✅ Instead of arrow.STRING

Configuration Note

The UseArrowNativeDecimal option is internal and not exposed through the public API, so users cannot override it. The new default nonetheless gives Arrow consumers the correct type semantics.

Testing

✅ Updated test expectations to reflect new default behavior
✅ All tests pass locally
✅ Arrow records now contain native decimal128 types instead of strings

Breaking Change

This is a minor breaking change for users who access Arrow records directly: decimal columns now arrive as Decimal128 arrays (array.Decimal128 in Arrow Go) instead of string arrays (array.String), so code that type-asserts the column to a string array will need updating. In exchange, the schema carries the correct Arrow type and manual parsing is no longer needed.

nikhilmehta16, Aug 01 '25 08:08