Parquet Filtering
Parquet comes with a very handy mechanism called "Column Statistics", which records, for example, the min/max values and the total number of null values in a column.
By reading those statistics, we don't need to iterate through the entire Parquet file when, for example, we are looking for data from a specific time range or value range.
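
To make the idea concrete, here is a minimal conceptual sketch in plain Python. It does not use any Parquet library. `ColumnStats`, `RowGroup`, `may_contain`, `scan` and `make_group` are hypothetical names introduced only for illustration. The sketch assumes that statistics are kept per row group and that every row group holds at least one non-null value. A reader that honors the statistics can skip any row group whose min/max range cannot overlap the requested range.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Hypothetical stand-in for the statistics of one column in one row group."""
    min_value: int
    max_value: int
    null_count: int


@dataclass
class RowGroup:
    stats: ColumnStats
    values: list  # column values; None marks a null


def make_group(values):
    """Build a row group and compute its statistics (assumes one non-null value)."""
    present = [v for v in values if v is not None]
    stats = ColumnStats(min(present), max(present), len(values) - len(present))
    return RowGroup(stats, values)


def may_contain(stats, low, high):
    """True if [low, high] overlaps the [min_value, max_value] range."""
    return stats.min_value <= high and stats.max_value >= low


def scan(row_groups, low, high):
    """Return the matching values and how many row groups were actually read."""
    matches, groups_read = [], 0
    for group in row_groups:
        if not may_contain(group.stats, low, high):
            continue  # the statistics rule this group out, so its values are never touched
        groups_read += 1
        matches.extend(
            v for v in group.values if v is not None and low <= v <= high
        )
    return matches, groups_read


row_groups = [
    make_group([1, 5, None, 9]),
    make_group([10, 14, 19]),
    make_group([20, None, 27, 30]),
]

matches, groups_read = scan(row_groups, 12, 22)
print(matches, groups_read)
```

In this sketch the first row group is skipped without reading its values, because its maximum is below the lower bound of the requested range. Note that the overlap check can only exclude row groups: a group that passes the check may still contain no matching rows, so the values themselves are filtered afterwards.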