[Ruby] Arrow::Table.new infers nested integer arrays as utf8 when all values are non-negative

Open hypsakata opened this issue 2 weeks ago • 2 comments

Describe the bug, including details regarding any error messages, version, and platform.

When creating an Arrow::Table from a Ruby Hash, if a column contains nested arrays consisting solely of non-negative Integer values, the column is incorrectly inferred as string (utf8) instead of a list of integers.

However, if a negative integer is present in the data, the column is correctly inferred as a list type.

Analysis (suspected root cause)

The issue appears to lie in red-arrow/lib/arrow/array-builder.rb: for non-negative Integers, detect_builder_info() returns a UIntArrayBuilder with detected: false (presumably so that the type can still be upgraded to a signed one if a negative value appears later).

In the case of Arrays, a ListArrayBuilder seems to be constructed only when sub_builder_info[:detected] is true. Consequently, nested arrays containing only non-negative integers fail to produce a list type, causing the column to fall back to string (utf8).
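
To make the suspected flow easier to follow, here is a minimal pure-Ruby sketch of the behavior described above. It is a toy model, not red-arrow's actual code: toy_detect, toy_infer_column, and the symbols standing in for builders are hypothetical names, and the real detect_builder_info() handles many more types.

```ruby
# Toy model of the suspected inference flow. NOT red-arrow's implementation;
# all names here are hypothetical and symbols stand in for builder objects.
def toy_detect(value)
  case value
  when Integer
    if value.negative?
      # A negative value settles the question: signed integer.
      { type: :int, detected: true }
    else
      # Non-negative: provisional unsigned type, left "undetected" so a
      # later negative value could still switch the column to signed.
      { type: :uint, detected: false }
    end
  when Array
    sub = nil
    value.each do |element|
      sub = toy_detect(element)
      break if sub[:detected]
    end
    if sub && sub[:detected]
      { type: [:list, sub[:type]], detected: true }
    else
      # Suspected bug path: the provisional element type is dropped,
      # so no list type is produced for this value.
      { type: nil, detected: false }
    end
  else
    { type: :string, detected: true }
  end
end

# Infers a column type from its values; falls back to :string when
# nothing was inferred (mirroring the fallback reported in this issue).
def toy_infer_column(values)
  info = nil
  values.each do |value|
    info = toy_detect(value)
    break if info[:detected]
  end
  (info && info[:type]) || :string
end

p toy_infer_column([[0, 1, 2], [3, 4]])   # only non-negative integers
p toy_infer_column([[0, -1, 2], [3, 4]])  # contains a negative integer
```

In this model the first call falls back to :string while the second yields [:list, :int], which matches the two cases in the reproduction steps below. A flat column such as [1, 2] still resolves to :uint, because the provisional type is only discarded on the Array path.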

Steps to reproduce the bug

require "arrow"

# Case 1: Only non-negative integers (Bug)
p Arrow::Table.new({ id: [1, 2], values: [[0, 1, 2], [3, 4]] }).schema
# Actual: values is inferred as string (utf8)
# Output:
# #<Arrow::Schema:... id: uint8
# values: string>

require "arrow"

# Case 2: Contains a negative integer (Works as expected)
p Arrow::Table.new({ id: [1, 2], values: [[0, -1, 2], [3, 4]] }).schema
# Actual: values is inferred as list<int8>
# Output:
# #<Arrow::Schema:... id: uint8
# values: list<item: int8>>

Expected behavior

values should be inferred as a list of integers (e.g. list<item: int*>), not string, even when all integers are non-negative. (The exact integer bit width may vary.)

Actual behavior

When all integers are non-negative, values is inferred as string (utf8). Adding a negative integer results in the correct list type inference.

Environment

  • OS: macOS 26.1
  • CPU arch: Apple M4 Pro
  • Ruby: 3.4.7
  • Gems: red-arrow 22.0.0
  • Arrow installation method: Homebrew

Component(s)

Ruby

Dec 13 '25 23:12 hypsakata

Good catch!

Do you want to open a PR for this?

Dec 14 '25 01:12 kou

Thanks! Yes, I'd like to open a PR for this.

Dec 14 '25 06:12 hypsakata

Issue resolved by pull request 48584 https://github.com/apache/arrow/pull/48584

Dec 20 '25 12:12 kou