Antoine Pitrou
@shanhuuang Do you want to take a look here?
Same problem as #14347: this is basically adding a partial, incomplete guard for a condition the caller is supposed to check for themselves.
If we wanted to check that the ExecBatch values correspond to the Schema, we should check for schema equality on each of the Datum values. That can unfortunately be expensive.
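For illustration, a full check would look roughly like the sketch below. `Field`, `Value`, and `CheckValuesMatchSchema` are hypothetical stand-ins, not the actual Arrow `Schema`, `Datum`, or `ExecBatch` APIs, and comparing type names as strings is a simplifying assumption. It rejects a count mismatch instead of truncating, then compares every value's type against its field, which is the step that can get expensive for nested types.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the Arrow types.
struct Field {
  std::string name;
  std::string type;
};
struct Value {
  std::string type;
};

// Returns an error message on mismatch, or std::nullopt if the values match.
std::optional<std::string> CheckValuesMatchSchema(
    const std::vector<Field>& schema, const std::vector<Value>& values) {
  // Reject a count mismatch outright rather than silently truncating.
  if (values.size() != schema.size()) {
    return "expected " + std::to_string(schema.size()) + " values, got " +
           std::to_string(values.size());
  }
  // Per-value type comparison: the potentially expensive part.
  for (std::size_t i = 0; i < schema.size(); ++i) {
    if (values[i].type != schema[i].type) {
      return "type mismatch for field '" + schema[i].name + "'";
    }
  }
  return std::nullopt;
}
```

Returning an error value rather than aborting keeps the failure something the caller can handle.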
> Another reason is that developers of large apps would generally prefer an error the app can handle, and perhaps raise to its user, over crashing the app.

I agree...
Edit: if this avoids an immediate crash and allows you to see `Validate` failing afterwards, then I would be ok with this (but let's not silently truncate output columns either).
@westonpace Are the failures mentioned above (mismatching schema size) legitimate?
Ok, it looks like Acero relies on being able to silently truncate the number of fields in that method, which is quite unfortunate.
> @pitrou, please let me know which checks to remove from the test.

In the interest of moving this forward before 10.0.0, I pushed some changes myself.
CI passed on @rtpsw's fork.
Thanks for the PR @mapleFU. I haven't looked in detail yet. IMHO the testing approach should be to add reference files in https://github.com/apache/parquet-testing/tree/master/data. Ideally these files should provide...