
Optimize the memory footprint during postprocessing

Open wenyikuang opened this issue 1 year ago • 1 comments

Why?

Right now it takes more than 200 GB of memory to run the sightglasspostprocessing in generate_metadata, which makes it buggy and painful to run. In the long term, memory use will grow as O(n) with the data size if we keep loading the whole thing into memory and editing it there.

That is not necessary.

How? Probably with the lazy loading that polars offers. This will probably require pruning the logic down to an MVP, then rewriting the indexing/loading logic, then adding the features back.

Restriction:

  • Don't introduce huge changes or shift the tech stack.

When: Before next release

Target: hopefully by the next release we can use a normal PC (~100 GB RAM) to finish the work.

wenyikuang avatar Mar 22 '24 22:03 wenyikuang

Ideally, if each upgrade can be processed sequentially, it can use < 32 GB of RAM per upgrade and therefore be run on any team member's machine.

asparke2 avatar Apr 09 '24 21:04 asparke2