
Optimize Yarn store exporters when underlying buses already have disk backing

Open chaburkland opened this issue 3 years ago • 1 comments

Description

Yarn store exporters could be optimized when every bus making up the Yarn is backed by disk in the same storage format as the one being exported, and the bus configs are uniform.

Example

import static_frame as sf

b1 = sf.Bus.from_zip_npz("store1.npz.zip")
b2 = sf.Bus.from_zip_npz("store2.npz.zip")

yarn = sf.Yarn.from_buses((b1, b2), retain_labels=False)

# This is currently slower than it needs to be: each frame must first be read in from npz,
# then immediately written back out
yarn.to_zip_npz("both_stores.npz.zip")

My suggestion is to add an optimized path that, under certain conditions, bypasses the store client exporter logic and replaces it with some simple copy operations.

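import os
import zipfile

def _fast_exporter(self, fp):
    # Sketch only: unpack each bus's backing zip archive straight into fp,
    # without loading any frames.
    os.makedirs(fp)

    for bus in self._series.values:
        with zipfile.ZipFile(bus._store._fp) as zf:
            zf.extractall(fp)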
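
If the target is a single zip archive rather than a directory, the same copy idea might look something like the sketch below. It uses only the standard library `zipfile` module. `merge_zip_archives` is a hypothetical helper, not part of static-frame, and it assumes that member names are unique across the source archives and that nothing inside the archives needs rewriting for the combined store.

```python
import zipfile
from pathlib import Path
from typing import Iterable, Union

PathLike = Union[str, Path]


def merge_zip_archives(sources: Iterable[PathLike], destination: PathLike) -> int:
    """Copy every member of each source archive into one new archive.

    Hypothetical helper for illustration. Returns the number of members
    copied and raises ValueError if two sources share a member name.
    """
    seen = set()
    # ZIP_STORED writes member bytes without compressing them.
    with zipfile.ZipFile(destination, "w", compression=zipfile.ZIP_STORED) as out:
        for source in sources:
            with zipfile.ZipFile(source) as zf:
                for info in zf.infolist():
                    if info.filename in seen:
                        raise ValueError(f"duplicate member name: {info.filename!r}")
                    seen.add(info.filename)
                    # Member bytes are copied as-is; no frame is deserialized.
                    out.writestr(info.filename, zf.read(info.filename))
    return len(seen)
```

Called as `merge_zip_archives(["store1.npz.zip", "store2.npz.zip"], "both_stores.npz.zip")`, this would cover the example above only if the conditions described earlier hold. A duplicate member name aborts the merge and leaves a partial destination file, so a real implementation would probably check for collisions before writing.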

chaburkland avatar Jan 07 '22 04:01 chaburkland

Very interesting idea; thanks for the suggestion. I think this is certainly worth exploring.

flexatone avatar Jan 07 '22 16:01 flexatone