Make `.from_storage()` listing caching lazy

Open ilongin opened this issue 1 year ago • 0 comments

This is a follow-up to PR https://github.com/iterative/datachain/pull/294, where the listing is saved (cached) inside the `.from_storage()` method itself, which means it is not lazy, unlike all the other DataChain steps. We should think about how to postpone caching / saving the listing until all chain steps are applied.

Ideas:

  1. Add a generic step to `DatasetQuery` that accepts a function (closure)
  2. Add a `.terminate()` method to the `DataChain` class
  3. ...
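
To make idea 1 more concrete, here is a rough, self-contained sketch of what a generic closure-accepting step could look like. It does not use the real DataChain or `DatasetQuery` code: `LazyChain`, `add_step`, `collect`, `list_fn`, and `cache` are all hypothetical names invented for illustration, and the in-memory dict only stands in for whatever the listing cache actually is.

```python
from typing import Callable, Dict, List, Optional

Rows = List[str]
Step = Callable[[Rows], Rows]


class LazyChain:
    """Toy stand-in for a chain of steps; NOT the real DataChain/DatasetQuery."""

    def __init__(self, steps: Optional[List[Step]] = None) -> None:
        self._steps: List[Step] = list(steps or [])

    def add_step(self, fn: Step) -> "LazyChain":
        # Hypothetical generic step: only records the closure, runs nothing.
        return LazyChain(self._steps + [fn])

    @classmethod
    def from_storage(
        cls,
        uri: str,
        list_fn: Callable[[str], Rows],
        cache: Dict[str, Rows],
    ) -> "LazyChain":
        def listing_step(_rows: Rows) -> Rows:
            # Listing and caching happen here, i.e. only at execution time.
            if uri not in cache:
                cache[uri] = list_fn(uri)
            return list(cache[uri])

        return cls().add_step(listing_step)

    def filter(self, pred: Callable[[str], bool]) -> "LazyChain":
        return self.add_step(lambda rows: [r for r in rows if pred(r)])

    def collect(self) -> Rows:
        # Terminal operation: apply every recorded step in order.
        rows: Rows = []
        for step in self._steps:
            rows = step(rows)
        return rows


# Usage with a fake lister that records when it is actually called.
calls: List[str] = []


def fake_list(uri: str) -> Rows:
    calls.append(uri)
    return ["a.jpg", "b.txt", "c.jpg"]


cache: Dict[str, Rows] = {}
chain = LazyChain.from_storage("s3://bucket/dir", fake_list, cache).filter(
    lambda path: path.endswith(".jpg")
)
calls_before_collect = list(calls)  # still empty: nothing has been listed yet
result = chain.collect()            # listing is performed and cached here
```

In this sketch, building the chain has no side effects; the listing is fetched and cached only when `collect()` runs, and a second `collect()` reuses the cached listing. Whether something similar fits the real `DatasetQuery` step model is an open question.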

ilongin, Aug 19 '24 16:08