Will fst(s) be additive

Open 1beb opened this issue 8 years ago • 1 comment

Is it possible to append to an fst without having to load it (completely)?

1beb avatar Feb 13 '17 18:02 1beb

Thanks @1beb for the feature request. I'm planning to add an fst.rbind method in the next version of the fst package. This method will only need to read some metadata from the existing file, so appending will be very fast, as per your request. Note, however, that fst uses a columnar binary file format. This means that appended data will basically be stored as a separate chunk inside the 'fst' file. That will have only a marginal impact on performance when large chunks of data are appended. When many small chunks are added sequentially, however, overall performance will suffer. A partial solution to this problem might be to define an fst.stream class (issue #15), which can be used to append data to an existing file through an internal buffer. When the number of chunks is known, you can also use an fst.lapply method to create a large on-disk data set from many smaller inputs (issue #18, also to be developed). This could also be done in parallel with an fst.parlapply method.

MarcusKlik avatar Feb 14 '17 09:02 MarcusKlik
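
To make the chunking trade-off described above concrete, here is a minimal Python sketch of a toy chunked columnar file. It is not the fst file format and not the planned fst.rbind implementation; the layout (a small header followed by length-prefixed JSON chunks) and the names `create`, `append_chunk`, and `read_all` are all assumptions made up for illustration. It only shows why an append needs nothing more than the header, and why a reader has to stitch one piece per column per chunk.

```python
import json
import os
import struct
import tempfile


def _write_block(fh, obj):
    """Write one length-prefixed JSON block (hypothetical toy layout)."""
    payload = json.dumps(obj).encode("utf-8")
    fh.write(struct.pack("<I", len(payload)))  # 4-byte little-endian size
    fh.write(payload)


def _read_block(fh):
    """Read one length-prefixed block, or return None at end of file."""
    prefix = fh.read(4)
    if len(prefix) < 4:
        return None
    (size,) = struct.unpack("<I", prefix)
    return json.loads(fh.read(size).decode("utf-8"))


def create(path, columns):
    """Create an empty file whose header lists the column names."""
    with open(path, "wb") as fh:
        _write_block(fh, list(columns))


def append_chunk(path, table):
    """Append rows as a new chunk; only the header is read, never the data."""
    with open(path, "rb") as fh:
        columns = _read_block(fh)
    if set(table) != set(columns):
        raise ValueError("chunk columns do not match the file header")
    if len({len(values) for values in table.values()}) > 1:
        raise ValueError("all columns in a chunk must have the same length")
    with open(path, "ab") as fh:
        _write_block(fh, {name: list(table[name]) for name in columns})


def read_all(path):
    """Read every chunk and concatenate them column by column."""
    with open(path, "rb") as fh:
        columns = _read_block(fh)
        data = {name: [] for name in columns}
        n_chunks = 0
        while True:
            chunk = _read_block(fh)
            if chunk is None:
                break
            for name in columns:
                data[name].extend(chunk[name])
            n_chunks += 1
    return data, n_chunks


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "toy.bin")
    create(path, ["id", "value"])
    append_chunk(path, {"id": [1, 2], "value": [0.5, 1.5]})
    append_chunk(path, {"id": [3], "value": [2.5]})
    data, n_chunks = read_all(path)
    print(n_chunks, data)
```

In this toy model the cost of an append does not depend on how much data is already in the file, but every read pays a fixed overhead per chunk. Buffering rows in memory and calling `append_chunk` less often, with larger tables, is the same idea as the internal buffer suggested for fst.stream.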