
Prefilter granule without column value when possible with RE/NRE matcher

Open cyriltovena opened this issue 3 years ago • 2 comments

When a column value is missing from the granule and we're filtering with RE/NRE matchers, there's a chance we can skip the granule up front.

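For example, suppose we're using {foo=~"bar"} and the foo dynamic label doesn't exist in the granule. For a regexp matcher this means we only need to run it once against the empty string: if the empty string does not satisfy the matcher, no row in the granule can, and the granule can be skipped.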
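A minimal sketch of that check, not frostdb's actual code. The names compileAnchored and canSkipMissingColumn are hypothetical, and the full anchoring of the pattern is an assumption borrowed from Prometheus-style matchers:

```go
package main

import "regexp"

// compileAnchored compiles pattern so that it must match the whole value.
// Full anchoring is an assumption (Prometheus-style matcher semantics).
func compileAnchored(pattern string) (*regexp.Regexp, error) {
	return regexp.Compile("^(?:" + pattern + ")$")
}

// canSkipMissingColumn reports whether a granule that lacks the column
// entirely can be skipped. A missing column is treated as the empty string
// for every row, so evaluating the regexp once against "" is enough.
func canSkipMissingColumn(re *regexp.Regexp, negated bool) bool {
	matchesEmpty := re.MatchString("")
	if negated {
		// NRE (!~) keeps rows that do NOT match. If "" matches,
		// every row is rejected and the granule can be skipped.
		return matchesEmpty
	}
	// RE (=~) keeps rows that match. If "" does not match,
	// no row can be kept and the granule can be skipped.
	return !matchesEmpty
}
```

With {foo=~"bar"} the empty string does not match, so the granule is skipped. With {foo=~".*"} the empty string matches, so the granule still has to be scanned.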

See https://github.com/polarsignals/frostdb/blob/main/table.go#L1162-L1166

I suspect the code requires more refactoring, as this logic is spread across the codebase and forces us to compile the regexp multiple times.

Maybe we could improve that interface to:

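type Filter interface {
	Eval(r arrow.Record) (*Bitmap, error)
	Filter(rg dynparquet.DynamicRowGroup) (bool, error)
	// The trailing bool was left unnamed in the proposal; Go does not allow
	// mixing named and unnamed parameters, so it is written as _ here.
	PreFilter(min *parquet.Value, max *parquet.Value, _ bool) (bool, error)
}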

cyriltovena avatar Jun 30 '22 06:06 cyriltovena

Precompiling the filter sounds good to me. I don't feel strongly about having an interface, since so far it is only used internally in the table constructs. I'd be happy with a struct that encapsulates the different compiled representations, so that we pass around that one struct instead of the individual representations (right now we already pass two of them around everywhere).
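Roughly what I have in mind, as a sketch only. The type compiledMatcher, its fields, and its methods are hypothetical names and not existing frostdb API, and anchoring the pattern is an assumption:

```go
package main

import "regexp"

// compiledMatcher is a hypothetical struct that is built once per query
// and passed around, so the regexp is compiled a single time.
type compiledMatcher struct {
	column       string
	negated      bool           // true for NRE (!~), false for RE (=~)
	re           *regexp.Regexp // compiled once in newCompiledMatcher
	matchesEmpty bool           // cached result of re.MatchString("")
}

// newCompiledMatcher compiles the pattern (fully anchored, by assumption)
// and caches whether it matches the empty string.
func newCompiledMatcher(column, pattern string, negated bool) (*compiledMatcher, error) {
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err
	}
	return &compiledMatcher{
		column:       column,
		negated:      negated,
		re:           re,
		matchesEmpty: re.MatchString(""),
	}, nil
}

// skipWhenColumnMissing reports whether a granule without the column can
// be skipped: RE skips when "" does not match, NRE skips when it does.
func (m *compiledMatcher) skipWhenColumnMissing() bool {
	return m.matchesEmpty == m.negated
}

// keepValue reports whether a row with value v passes the matcher.
func (m *compiledMatcher) keepValue(v string) bool {
	return m.re.MatchString(v) != m.negated
}
```

The same struct could later carry the other compiled representations (for example the row group filter and the record filter), which is what would let us stop passing them around separately.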

brancz avatar Jun 30 '22 06:06 brancz

For completeness' sake (since the main link above is outdated), these are the lines that were originally linked: https://github.com/polarsignals/frostdb/blob/bde4295ee60a6f3e1202c287716c752a5ba18295/table.go#L1267-L1271

brancz avatar Jul 29 '22 12:07 brancz

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Jan 05 '24 01:01 github-actions[bot]