iceberg
Iceberg is a table format for large, slow-moving tabular data
When no database is specified, table resolution falls back to a hardcoded "default" database. It should instead respect spark.catalog.currentDatabase. I have a simple pull request prepared...
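The fallback described above can be sketched roughly as follows. This is a minimal illustration, not Iceberg's actual code: the class `TableIdentifierResolver` and its `resolve` method are hypothetical names, and the current database is passed in as a `Supplier` so the sketch stays free of Spark dependencies. In a Spark job the supplier would presumably wrap `spark.catalog().currentDatabase()`.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Iceberg: resolves a table name to
// {database, table}, deferring to the session's current database instead
// of a hardcoded "default" when no database is given.
final class TableIdentifierResolver {
  private TableIdentifierResolver() {}

  // name is either "db.table" or just "table".
  // currentDatabase is an assumed hook; with Spark it could be
  // () -> spark.catalog().currentDatabase().
  static String[] resolve(String name, Supplier<String> currentDatabase) {
    int dot = name.indexOf('.');
    if (dot >= 0) {
      // Explicit database: use it as written.
      return new String[] {name.substring(0, dot), name.substring(dot + 1)};
    }
    // No database: ask the session rather than assuming "default".
    return new String[] {currentDatabase.get(), name};
  }
}
```

With a session whose current database is `analytics`, `resolve("events", () -> "analytics")` yields `analytics` and `events`, while `resolve("logs.events", ...)` keeps the explicit `logs` database.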
While starting work on Iceberg's schema evolution for ORC, I found that the current test case is full of Avro types and data structures. That doesn't work at all for ORC, because...
This is blocked on [ORC-305](https://issues.apache.org/jira/browse/ORC-305) and a release that contains it.
We need to support timestamps with timezone for ORC. This is blocked by [ORC-189](https://issues.apache.org/jira/browse/ORC-189) and a release that contains it.
The SparkOrcWriter defines converters that could easily throw exceptions when a required column has a null value, like this:

```java
static class RequiredIntConverter implements Converter {
  public void addValue(int rowId,...
```
Distinct counts aren't very valuable to cost-based optimization because they can't be easily merged across files. They should be removed. As a replacement, look into storing HyperLogLog (HLL) buffers if they aren't too...
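To illustrate why per-file distinct counts do not merge, here is a small sketch. The class `DistinctCountMerge` and its methods are hypothetical, and an exact `HashSet` stands in for a mergeable summary; a real replacement would use an approximate sketch such as HLL rather than storing every value.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not Iceberg code.
final class DistinctCountMerge {
  private DistinctCountMerge() {}

  // From two plain counts, only bounds on the combined count are recoverable.
  static long lowerBound(long countA, long countB) {
    return Math.max(countA, countB); // one file's values fully contain the other's
  }

  static long upperBound(long countA, long countB) {
    return countA + countB; // the files share no values
  }

  // A mergeable summary (here an exact set, standing in for an HLL buffer)
  // can be unioned, so the combined distinct count is well defined.
  static long mergedDistinct(Set<Integer> a, Set<Integer> b) {
    Set<Integer> union = new HashSet<>(a);
    union.addAll(b);
    return union.size();
  }
}
```

Two files that each report 3 distinct values could hold {1, 2, 3} and {3, 4, 5} (5 distinct overall) or the same {1, 2, 3} twice (3 distinct overall). The counts alone only say the answer lies between 3 and 6, whereas the unioned summaries give the actual figure.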