
A range can be specified with read.fst on sorted data frames

Open MarcusKlik opened this issue 8 years ago • 0 comments

When a sorted data set is stored as a fst binary file, sorting metadata is stored alongside the data. Using this metadata, a binary search can be performed on the key columns before any data is actually read. For example, finding the begin or end value of a selected range among 4 billion rows takes only about 32 random seeks in the binary file, because log2 of 4 billion is roughly 32. The performance penalty will be very small, since seeking on modern SSDs is very fast.
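To illustrate the idea, here is a minimal sketch in Python. It is not fst's implementation. It assumes the key column is sorted in ascending order, and it uses an in-memory sequence as a stand-in for the on-disk column. The names `CountingColumn` and `row_range` are hypothetical; each element read stands in for one random seek.

```python
from bisect import bisect_left, bisect_right


class CountingColumn:
    """Sorted key column that counts element reads (a stand-in for disk seeks)."""

    def __init__(self, values):
        self._values = values
        self.reads = 0

    def __len__(self):
        return len(self._values)

    def __getitem__(self, index):
        self.reads += 1  # one read here models one random seek in the file
        return self._values[index]


def row_range(key_column, begin, end):
    """Return (first, last) so that rows first..last-1 have begin <= key <= end."""
    first = bisect_left(key_column, begin)   # first row with key >= begin
    last = bisect_right(key_column, end)     # one past the last row with key <= end
    return first, last


# Hypothetical data: 1,000,000 sorted even keys 0, 2, 4, ...
keys = CountingColumn(range(0, 2_000_000, 2))
first, last = row_range(keys, 1000, 2000)
print(first, last, keys.reads)
```

With one million rows, each of the two lookups needs at most 20 reads, so only the rows between `first` and `last` would then have to be read from the file.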

MarcusKlik, Feb 01 '17 20:02