ARROW-16778: [C++] Fix win32 build and tests
https://issues.apache.org/jira/browse/ARROW-16778
Looks like it still doesn't build on 32-bit Windows and there are tons of compiler warnings about casts (one of which is an error). We might want a general way to deal with int64_t -> size_t conversions.
Looks like it still doesn't build on 32-bit Windows and there are tons of compiler warnings about casts (one of which is an error). We might want a general way to deal with
int64_t -> size_tconversions.
I'm thinking to add a templated ToSizeT function to a utils header for converting integer types to size_t, that adds pragmas to suppress warnings on 32-bit platforms (or when going from a signed integer).
FWIW, an explicit static_cast is enough to suppress the warning. I'm more thinking: on 32-bit platforms, do we also want to check for potential overflow?
I would be fine with static_cast<size_t>. A 32-bit build cannot represent any in-memory containers larger than 2**32 elements (except perhaps a BooleanArray...).
Fair, though ChunkedArray and IPC may throw that off? I suppose we only need to check in certain places for that though
ChunkedArray is still in-memory. For other places (such as file access) we might want to check.
@lidavidm @pitrou , the warnings are now all fixed for the 32-bit MSVC build.
It is failing to build linking some parquet tests that depend on boost::filesystem (trying to link to the 64-bit version incorrectly). I did set ARROW_PARQUET to off in the github workflow though, so I'm not sure why it's even trying to build this test or the parquet code.
This build is also failing archery linting. Mostly around line wrapping. Is there an automated way of fixing the archery formatting warnings?
@jduo you can use Archery for autoformatting: https://arrow.apache.org/docs/developers/cpp/development.html#code-style-linting-and-ci
@github-actions autotune
@jduo can you rebase here? The autotune, weirdly enough, does not actually kick off CI.
@jduo it turns out ARROW_SUBSTRAIT implies ARROW_PARQUET so we should turn that off, too
Hmm, I suppose you could still have a NA array with > (edit: size_t) rows on 32-bit, since there's no actual storage…
Or actually. A boolean array could also technically exceed size_t rows, since there's 8 rows per byte?
While it is technically ok, I think sprinkling static_cast<size_t> everywhere is both cumbersome (on the code writing and reading sides) and will turn out fragile maintenance-wise.
I also don't think we're very interested in ensuring highest quality support for 32-bit Windows, so I would favor an approach that's much lighter on maintenance and code writing demands. (for example by selectively disabling some warnings on 32-bit MSVC)
While it is technically ok, I think sprinkling
static_cast<size_t>everywhere is both cumbersome (on the code writing and reading sides) and will turn out fragile maintenance-wise.I also don't think we're very interested in ensuring highest quality support for 32-bit Windows, so I would favor an approach that's much lighter on maintenance and code writing demands. (for example by selectively disabling some warnings on 32-bit MSVC)
The argument against disabling compilation warnings is that if the warnings come from a public header, it the library user inherits the warning and can't really work around it. They would need to disable the warning on their side if they report warnings as failures in their build system, which may be undesirable for example.
Well, we could selectively disable warnings in Arrow headers, but I fear that a lot of headers would probably be affected :-(
Perhaps this needs to be further discussed on the mailing-list. My opinion is that 32-bit support should not come with undue maintenance load.
Closing because it has been untouched for a while, in case it's still relevant feel free to reopen and move it forward 👍