feat: use VarBinView as the new canonical string type

Open a10y opened this issue 1 year ago • 5 comments

Supplants #476.

VarBinView is the new canonical representation for string types across the repo. Many places still use VarBin arrays internally; we can replace those over time.

  • Canonical::VarBin -> Canonical::VarBinView
  • FSSTArray, ConstArray, DictArray now all canonicalize into VarBinView
  • Updated the TPC-H setup to use Utf8View schemas
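
For readers unfamiliar with view-based string layouts, here is a minimal, std-only sketch of the idea behind Arrow's Utf8View, which VarBinView mirrors: short strings (up to 12 bytes) live inline in the view, and longer ones keep a 4-byte prefix plus a pointer into a shared data buffer. The `View` and `ViewColumn` types and the `from_offsets` helper are hypothetical names for illustration only; they are not Vortex's actual API, and the real layout packs each view into 16 bytes and also carries a buffer index, which this sketch leaves out.

```rust
// Illustrative toy only: NOT Vortex's VarBinView implementation.
#[derive(Debug, Clone, Copy, PartialEq)]
enum View {
    // Strings of at most 12 bytes are stored inline in the view itself.
    Inline { len: u8, data: [u8; 12] },
    // Longer strings keep a 4-byte prefix and an offset into a shared buffer.
    Ref { len: u32, prefix: [u8; 4], offset: u32 },
}

struct ViewColumn {
    views: Vec<View>,
    buffer: Vec<u8>,
}

impl ViewColumn {
    // Build a view column from an offsets-based (VarBin-style) layout:
    // `offsets` has n + 1 entries delimiting n values inside `bytes`.
    fn from_offsets(offsets: &[usize], bytes: &[u8]) -> Self {
        let mut views = Vec::with_capacity(offsets.len().saturating_sub(1));
        let mut buffer = Vec::new();
        for w in offsets.windows(2) {
            let s = &bytes[w[0]..w[1]];
            if s.len() <= 12 {
                let mut data = [0u8; 12];
                data[..s.len()].copy_from_slice(s);
                views.push(View::Inline { len: s.len() as u8, data });
            } else {
                let mut prefix = [0u8; 4];
                prefix.copy_from_slice(&s[..4]);
                views.push(View::Ref {
                    len: s.len() as u32,
                    prefix,
                    offset: buffer.len() as u32,
                });
                buffer.extend_from_slice(s);
            }
        }
        ViewColumn { views, buffer }
    }

    // Read the i-th value back, from the inline bytes or the shared buffer.
    fn get(&self, i: usize) -> &[u8] {
        match &self.views[i] {
            View::Inline { len, data } => &data[..*len as usize],
            View::Ref { len, offset, .. } => {
                let start = *offset as usize;
                &self.buffer[start..start + *len as usize]
            }
        }
    }
}
```

Because every view has a fixed width, operations such as take or filter can shuffle views without copying string bytes, and the inline prefix allows many comparisons to finish without touching the data buffer.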

a10y avatar Sep 05 '24 21:09 a10y

I think https://github.com/apache/arrow-rs/issues/6366 is going to make the Python tests fail

robert3005 avatar Sep 06 '24 12:09 robert3005

Yea, even bumping PyArrow from 15 -> 17 (latest) did not seem to change that

a10y avatar Sep 06 '24 18:09 a10y

Blocked on https://github.com/apache/arrow-rs/pull/6368

a10y avatar Sep 06 '24 21:09 a10y

Converting back to draft while this is blocked

a10y avatar Oct 04 '24 14:10 a10y

I think with arrow 53.1.0 this is no longer blocked

robert3005 avatar Oct 06 '24 23:10 robert3005

image

Time for a take3 PR 🥲

a10y avatar Oct 17 '24 14:10 a10y

git cli is a lot smarter than github ui but probably still hard

robert3005 avatar Oct 17 '24 14:10 robert3005