bigfile
A reproducible massively parallel IO library for hierarchical data
We have been told by the Frontera admins that it is causing problems for Astrid, as multiple files are written in quick succession and this forces the file servers to sync,...
TODO:
- [x] Allow overriding bigfile methods, and port existing uses to a C-based POSIX backend.
- [x] Add Python correspondence, and PyBackend, URLBackend.
- [ ] Clean up...
These changes will allow us to use bigfile to do journals (e.g. record black hole details per step).

1. RecordType = [ ( column name, dtype ) ] big_record_set(rt, void* record_buf,...
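The idea of a record type as a list of (column name, dtype) pairs can be illustrated with a minimal Python sketch. This is not bigfile's API: the column names, the use of `struct` format codes in place of dtypes, and the helpers `record_struct`, `pack_records`, and `split_columns` are all assumptions made for illustration. It only shows how a row-oriented journal buffer could be split back into per-column data.

```python
import struct

# Hypothetical record type: a list of (column name, struct format code).
# "q" = int64, "d" = float64, "i" = int32.
RECORD_TYPE = [("id", "q"), ("mass", "d"), ("step", "i")]

def record_struct(record_type):
    """Build a packed little-endian layout for one record (no padding)."""
    return struct.Struct("<" + "".join(code for _, code in record_type))

def pack_records(record_type, rows):
    """Serialize rows (tuples in column order) into one contiguous buffer."""
    layout = record_struct(record_type)
    return b"".join(layout.pack(*row) for row in rows)

def split_columns(record_type, buf):
    """Turn a buffer of packed records back into one list per column."""
    layout = record_struct(record_type)
    columns = {name: [] for name, _ in record_type}
    for values in layout.iter_unpack(buf):
        for (name, _), value in zip(record_type, values):
            columns[name].append(value)
    return columns
```

With this layout each record occupies 20 bytes, so a journal entry per step could be appended as a fixed-size row and later transposed into columns.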
Offline RLE compression on a single Column

RLEBLOB files are compressed with run-length encoding. The encoding shall be type-aware. There shall be some kind of dynamic range compensation for efficient...
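To show why type awareness matters for run-length encoding, here is a minimal Python sketch. It is an illustration only and does not describe the RLEBLOB format: the function names and the use of `struct` format codes to stand in for column dtypes are assumptions. The point is that runs are detected over whole items rather than raw bytes, so three equal int32 values form one run instead of being broken up by their zero padding bytes.

```python
import struct
from itertools import groupby

def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out

def rle_encode_typed(raw, code):
    """Type-aware RLE: decode raw little-endian bytes as items of the given
    struct format code (e.g. "i" for int32), then encode runs of items."""
    itemsize = struct.calcsize("<" + code)
    count = len(raw) // itemsize
    items = struct.unpack("<%d%s" % (count, code), raw[:count * itemsize])
    return rle_encode(items)
```

Note that NaN values never compare equal, so each NaN in a floating-point column would form its own run of length one in this sketch.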
The MPI write function shall be refactored so that it can create one blob file per writer task and dedicate the blob file to that writer task. We shall also...
Shall we do the following renaming in the API:
```
BigFile -> Group
BigBlock -> Column
BigData -> DataSet
```
bigfile shall provide a function to detect whether an NFS backend is in use. The client is then free to die if multiple nodes are issuing writes. Later this can be enhanced...
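One way such a check could work on Linux is sketched below in Python. This is an assumption, not an existing bigfile function: the names `fstype_for_path` and `is_nfs` are hypothetical. The sketch finds the mount point that contains a path in `/proc/mounts`-style text and reports its filesystem type. It ignores the octal escapes (such as `\040` for a space) that the kernel uses in mount point names.

```python
import os

def fstype_for_path(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path.

    mounts_text uses the /proc/mounts layout: device, mount point, type, ...
    """
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        # Compare with a trailing slash so "/scratch" does not match "/scratchy".
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type

def is_nfs(path, mounts_file="/proc/mounts"):
    """Linux-only: True if path lives on an NFS mount ("nfs", "nfs4", ...)."""
    with open(mounts_file) as f:
        fstype = fstype_for_path(os.path.realpath(path), f.read())
    return fstype is not None and fstype.startswith("nfs")
```

A caller could run `is_nfs` on the output directory before opening files for writing and refuse to proceed when more than one node is about to write.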
Two data set sizes:
1. Big (1T rows)
2. Small (1G rows)

Fix BytesPerFile: 1GB (per BlobFile)

Varying number of nodes as a fraction of BlueWaters: Possibly just two is...
We need a helper method for this. Two versions: one for BigFile and one for BigFileMPI.
@rainwoodman Hi Yu, the admins at TACC have raised an issue with our code using the `bigfile` Python interface. Apparently, it issues too many queries to the filesystem, similar to your comment...