Weston Pace
If we can satisfy a filter with a scalar index then we should be able to implement a fast path for `count_rows` where we simply return the size of the...
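As a rough illustration of the fast-path idea, here is a minimal sketch that assumes the index search yields a set of matching row ids. `ToyScalarIndex`, `search_eq`, `count_rows_slow`, and `count_rows_fast` are hypothetical names made up for this example, not the project's API, and the sketch ignores concerns such as deleted rows or unindexed fragments.

```python
from collections import defaultdict


class ToyScalarIndex:
    """Hypothetical stand-in for a scalar index: maps value -> set of row ids."""

    def __init__(self, values):
        self._rows_by_value = defaultdict(set)
        for row_id, value in enumerate(values):
            self._rows_by_value[value].add(row_id)

    def search_eq(self, value):
        # Return the row ids whose value equals `value` (empty set if none).
        return self._rows_by_value.get(value, set())


def count_rows_slow(values, target):
    # Baseline: scan every value and evaluate the filter row by row.
    return sum(1 for v in values if v == target)


def count_rows_fast(index, target):
    # Fast path: the index already knows the matching rows, so the count
    # is just the size of the search result. No data is scanned.
    return len(index.search_eq(target))


values = ["a", "b", "a", "c", "a"]
index = ToyScalarIndex(values)
```

Both functions return the same count; the fast path simply never touches the column data.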
This method takes in a filename / path and synchronously creates the fragment metadata for it. We could potentially read the file footer to determine the version; this would be...
Ngram indices are indices that can speed up various string filters. To start with they will be able to speed up `contains(col, 'substr')` filters. They work by creating a bitmap...
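To make the general technique concrete, here is a small sketch of an ngram index answering a `contains`-style query. It assumes trigrams (`N = 3`) and uses Python sets as a stand-in for bitmaps; `ToyNgramIndex`, `candidates`, and `contains` are hypothetical names for this example only and do not reflect the actual implementation.

```python
from collections import defaultdict

N = 3  # assumed ngram length for this sketch


def ngrams(text, n=N):
    # All distinct substrings of length n; empty if the text is shorter than n.
    return {text[i:i + n] for i in range(len(text) - n + 1)}


class ToyNgramIndex:
    """Hypothetical ngram index: maps each ngram -> set of row ids (a toy bitmap)."""

    def __init__(self, strings):
        self._strings = list(strings)
        self._postings = defaultdict(set)
        for row_id, s in enumerate(self._strings):
            for gram in ngrams(s):
                self._postings[gram].add(row_id)

    def candidates(self, substr):
        grams = ngrams(substr)
        if not grams:
            # Query is shorter than N, so the index cannot narrow anything.
            return set(range(len(self._strings)))
        # A matching row must contain every ngram of the query string.
        return set.intersection(*(self._postings.get(g, set()) for g in grams))

    def contains(self, substr):
        # Candidates can include false positives, so recheck the real strings.
        return sorted(r for r in self.candidates(substr) if substr in self._strings[r])


strings = ["hot dogs", "dogma", "cats", "god", "abcxbcd"]
idx = ToyNgramIndex(strings)
```

Note that intersecting the per-ngram sets only yields candidates: `"abcxbcd"` contains both `abc` and `bcd` but not `abcd`, which is why the sketch rechecks each candidate against the original string.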
Previously we moved all FSL handling out into the rep/def levels. Now we're back at it again, but this time we've put the FSL handling back into the compressors. It...
If a string column has an FTS index then we should have enough information to speed up a variety of string-based filters. Here is a (currently very partial as I...
Currently this meta-column can't be used in filters and "delete by id" is a common enough operation that it would be nice to support. Also, because the row id doesn't...
Example scenario: A document is chunked into paragraphs and each paragraph is embedded and the row contains the document_id and the paragraph_id. Later, the user recalculates the embedding for one...
This just captures some very basic statistics. I'd like to eventually add bytes read, decode time, and time waiting on I/O to the mix. However, those will need to wait...
Inverted indices should be able to speed up filters like:

```
contains(my_str_col, 'dogs')
```