databricks-sql-go icon indicating copy to clipboard operation
databricks-sql-go copied to clipboard

Error while scanning column of type `array<decimal(10,2)>` into Go type `[]float64`

Open esdrasbeleza opened this issue 1 year ago • 0 comments

I have a table that has a column AMOUNTS with a type array<decimal(10,2)>, which I expect to scan into a field type []float64. This is the test I wrote to reproduce it:

func (s *DatabricksRepositorySuite) Test_Float64_Conversion() {
	query := `SELECT AMOUNTS FROM components WHERE AMOUNTS IS NOT NULL AND array_size(AMOUNTS) > 0 LIMIT 1`
	var result []float64
	err := s.repository.db.QueryRow(query).Scan(&result)
	s.NoError(err)
}

When I run my query, I get this error:

sql: Scan error on column index 0, name "AMOUNTS": unsupported Scan, storing driver.Value type string into type *[]float64

Source

This error comes from the standard library in Go: the value for that field is provided by the SDK as a string, but the destination is []float64, so it fails. I also dove into Databricks' SDK source code and found the place [2] where the string is generated.

[1] Golang source - convertAssignRows [2] Databricks SDK - listValueContainer.Value

Workaround

I created a custom type for the fields I have that use arrays:

import (
	"database/sql/driver"
	"encoding/json"
	"errors"
)

type databricksArray[T any] struct {
	parsedValue []T
}

func (f *databricksArray[T]) Scan(value interface{}) error {
	switch typedValue := value.(type) {
	case []T:
		f.parsedValue = typedValue
		return nil
	case string:
		return f.parseString(typedValue)
	default:
		return errors.New("error while handling databricks type")
	}
}

func (f *databricksArray[T]) parseString(value string) error {
	// This is a workaround for Databricks returning float arrays as strings
	// Example: `[1.0, 2.0, 3.0]`
	// We need to convert it to an array
	// This should be removed if/when Databricks fixes this issue
	valueBytes := []byte(value)
	var parsedValue []T
	if err := json.Unmarshal(valueBytes, &parsedValue); err != nil {
		return err
	}
	f.parsedValue = parsedValue
	return nil
}

func (f *databricksArray[T]) Value() (driver.Value, error) {
	return f.parsedValue, nil
}

func (f *databricksArray[T]) Slice() []T {
	if f == nil {
		return nil
	}
	return f.parsedValue
}

So my field

Amount []float64 `db:"AMOUNTS"`

is now

Amount databricksArray[float64] `db:"AMOUNTS"`

and I call Amount.Slice() when I want to read its value.

esdrasbeleza avatar Apr 11 '24 11:04 esdrasbeleza