
Consider future of the "data" part of MLEM

Open aguschin opened this issue 3 years ago • 0 comments

Once the "model" part of MLEM is more mature, we'll need to decide how to develop its "data" part.

Posting @daavoo's feedback on this:

Why does MLEM handle datasets? What are the benefits of handling data via MLEM?

IMO, there appears to be a conflict of responsibilities between MLEM and the ML framework (which is usually in charge of loading data).

How does mlem.load / mlem.save (which focuses on primitives: DataFrame, Array, Tensor) integrate with more complex ML framework data loading, such as Hugging Face datasets, PyTorch Lightning data modules, or TensorFlow datasets?

How to handle things that don’t fit into memory?
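
To make the out-of-memory question concrete, here is a minimal sketch of the streaming pattern that framework data loaders generally follow: read the source lazily and hold only one batch in memory at a time. It uses only the Python standard library and is not MLEM's API; `iter_batches` is a hypothetical helper, and the in-memory `StringIO` stands in for a large file on disk.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator, List


def iter_batches(rows: Iterable[List[str]], batch_size: int) -> Iterator[List[List[str]]]:
    """Yield lists of at most batch_size rows, one batch at a time."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for a large file; a real case would use open(path, newline="").
fake_file = io.StringIO("a,b\n1,2\n3,4\n5,6\n")
reader = csv.reader(fake_file)  # csv.reader is lazy: it parses rows on demand
header = next(reader)

# Only the current batch is materialized; here we just record its size.
batch_sizes = [len(batch) for batch in iter_batches(reader, batch_size=2)]
```

The point of the sketch is that a save/load interface built around whole in-memory objects (DataFrame, Array, Tensor) has no natural place for an iterator like this, which is the gap the question above is pointing at.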

aguschin, Aug 17 '22 08:08