Antoine Pitrou
If you're going to use SIMD lookups, then I would ask that it rely on xsimd instead of intrinsics.
> Well, it's a speedup even without SIMD instructions. To put it in oversimplified terms, each slot occupies 1 byte, so with an int64 we can already check 8 slots...
Can you post benchmark numbers on string values? While hashing integers is nice, I'm not sure that's a dominant use case.
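For readers unfamiliar with the trick mentioned in the quote, here is a minimal sketch of checking 8 one-byte slots with plain 64-bit integer arithmetic (no SIMD instructions). It is not Arrow's or Abseil's actual code; `PackSlots` and `MatchByte` are hypothetical names chosen for illustration, and the slot layout (slot `i` in byte lane `i`) is an assumption.

```cpp
#include <cstdint>

// Hypothetical helper: pack 8 one-byte slot tags into one 64-bit word,
// with slot i stored in bits [8*i, 8*i + 8).
inline uint64_t PackSlots(const uint8_t (&slots)[8]) {
  uint64_t group = 0;
  for (int i = 0; i < 8; ++i) {
    group |= static_cast<uint64_t>(slots[i]) << (8 * i);
  }
  return group;
}

// Hypothetical helper: compare all 8 slots against `tag` at once.
// Returns a mask with bit 7 of lane i set if and only if slot i == tag.
inline uint64_t MatchByte(uint64_t group, uint8_t tag) {
  constexpr uint64_t kLsbs = 0x0101010101010101ULL;
  constexpr uint64_t kLow7 = 0x7F7F7F7F7F7F7F7FULL;
  // Broadcast `tag` to every lane; matching lanes become zero after XOR.
  const uint64_t x = group ^ (kLsbs * tag);
  // Per lane, the high bit of `t` is set iff the low 7 bits of x are nonzero.
  // The sum is at most 0xFE per lane, so no carry crosses lane boundaries.
  const uint64_t t = (x & kLow7) + kLow7;
  // A lane is zero iff neither `t` nor `x` has its high bit set.
  return ~(t | x | kLow7);
}
```

For example, with slots `{0x11, 0x22, 0x33, 0x22, 0x00, 0x7F, 0x22, 0x01}`, `MatchByte(group, 0x22)` sets the high bit of lanes 1, 3 and 6. This variant is exact; shorter formulations exist that tolerate occasional false positives, which is acceptable when each candidate is confirmed by a full key comparison anyway.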
Ok. I agree we could replace MemoTable with something better. However, we don't want to have a dependency on Abseil, so it will have to be reimplemented.
@SGZW Anyone can work on an issue without asking for permission. Feel free to experiment and submit a PR when ready.
Arrow has no notion of logical types. But, yes, making the Columnar format spec more readable would be useful.
Ok, unfortunately, Columnar.rst _does_ use the wording "logical type", which is contradicted by the fact that there's no separate set of "physical types" (only layouts). The whole thing has always...
I agree it would be nice, at least as a synthetic table.
The "logical" vs. "physical" distinction is actually extra confusing, because nowadays we _do_ have logical types (aka semantic variations of existing types), but they are called... extension types.
I don't think so. They're extension types, not part of the columnar spec itself. You may instead add a `seealso` after the table to point to https://arrow.apache.org/docs/format/CanonicalExtensions.html