static-frame
Optimize Yarn store exporters when underlying buses already have disk backing
Description
There is an opportunity to optimize Yarn store exporters when all of the buses that make up the Yarn are backed by disk in the same storage format as the one being exported, and their configs are uniform.
Example
b1 = sf.Bus.from_zip_npz("store1.npz.zip")
b2 = sf.Bus.from_zip_npz("store2.npz.zip")
yarn = sf.Yarn.from_buses((b1, b2), retain_labels=False)
# This is currently slower than it needs to be: each frame must first be read in from npz,
# and is then immediately written back out
yarn.to_zip_npz("both_stores.npz.zip")
My suggestion would be to have an optimized path which, under certain conditions, could bypass the store client exporter logic and replace it with some simple copy operations.
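import os
import zipfile

def _fast_exporter(self, fp):
    # Rough sketch: unpack each bus's backing archive straight into the target directory
    os.makedirs(fp)
    for bus in self._series.values:
        with zipfile.ZipFile(bus._store._fp) as zf:
            zf.extractall(fp)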
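If the target should be a single zip archive rather than a directory, the copy step could look roughly like the sketch below. It uses only the standard library and is not static-frame code: `merge_zip_archives` is a hypothetical helper, and it assumes each source archive holds independent entries whose names do not collide across archives. Whether that holds for Yarn exports (for example with `retain_labels=False`) would need checking.

```python
import zipfile
from pathlib import Path
from typing import Iterable


def merge_zip_archives(sources: Iterable[Path], dst: Path) -> list[str]:
    """Copy every member of each source archive into one new archive.

    Hypothetical helper: entries are copied as opaque bytes and are never
    deserialized into frames.
    """
    copied: list[str] = []
    seen: set[str] = set()
    with zipfile.ZipFile(dst, "w") as zf_out:
        for src in sources:
            with zipfile.ZipFile(src) as zf_in:
                for info in zf_in.infolist():
                    # Assumption: member names must be unique across sources
                    if info.filename in seen:
                        raise ValueError(f"duplicate member name: {info.filename!r}")
                    seen.add(info.filename)
                    # Keep each entry's original compression method
                    zf_out.writestr(
                        info.filename,
                        zf_in.read(info.filename),
                        compress_type=info.compress_type,
                    )
                    copied.append(info.filename)
    return copied
```

With the example above, this would be called as `merge_zip_archives([Path("store1.npz.zip"), Path("store2.npz.zip")], Path("both_stores.npz.zip"))`. Raising on a duplicate name is a deliberate choice here: silently overwriting an entry would drop a frame, so a real fast path would probably fall back to the normal exporter in that case.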
Very interesting idea; thanks for the suggestion. I think this is certainly worth exploring.