Create an @embedding decorator used for projection, KNN, etc.

Open cabreraalex opened this issue 2 years ago • 4 comments

Often we just want an embedding of the input data, NOT the output data. Currently we only support embeddings that come with a model prediction.

It would be nice to have embeddings run just once on the input data.

One option is to make this a @distill function that we annotate as an embedding...

We could also generalize this to annotations for projections?
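A rough sketch of what the annotation idea could look like. Everything here is hypothetical: the `kind` argument, the `REGISTRY` dict, and the toy `text_stats` embedding are illustration only, not the existing decorator API.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical registry mapping an annotation kind ("embedding",
# "projection", ...) to the functions tagged with that kind.
REGISTRY: Dict[str, List[Callable]] = {}


def distill(kind: str = "column"):
    """Hypothetical decorator factory that tags a function with a kind."""

    def decorator(fn: Callable) -> Callable:
        fn.kind = kind  # remember what this function produces
        REGISTRY.setdefault(kind, []).append(fn)
        return fn

    return decorator


@distill(kind="embedding")
def text_stats(inputs: Sequence[str]) -> List[List[float]]:
    """Toy embedding built from the input data only (no model output)."""
    vowels = set("aeiou")
    return [
        [float(len(text)), float(sum(ch in vowels for ch in text.lower()))]
        for text in inputs
    ]


# Downstream features (projection, KNN, ...) could look up tagged functions.
embedding_fns = REGISTRY["embedding"]
vectors = embedding_fns[0](["hello", "sky"])
```

The same tag could carry other kinds (e.g. `kind="projection"`), so one mechanism would cover both cases.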

Mar 11 '23 00:03 cabreraalex

@xnought thoughts?

Mar 11 '23 00:03 cabreraalex

Hmm good thought. I like the idea with the distill functions.

Mar 11 '23 02:03 xnought

Yeah but for stuff that needs the full data to compute, we would need a different system.

For example, t-SNE could not work with distill unless we let people define batch sizes in the distill decorator itself, e.g. distill(batches=1), so that it just gets run on the entire dataset.
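A minimal sketch of the `batches` idea, assuming a hypothetical `distill(batches=...)` factory and a hypothetical `run_distill` runner (neither exists as written). Mean-centering stands in for t-SNE here because it also needs the whole dataset to give the right answer.

```python
import math
from typing import Callable, List, Sequence


def distill(batches: int = 4):
    """Hypothetical: record how many batches the data should be split into."""
    if batches < 1:
        raise ValueError("batches must be >= 1")

    def decorator(fn: Callable) -> Callable:
        fn.batches = batches
        return fn

    return decorator


def run_distill(fn: Callable, data: Sequence[float]) -> List[float]:
    """Hypothetical runner: call fn once per batch and concatenate results."""
    if not data:
        return []
    size = math.ceil(len(data) / fn.batches)
    out: List[float] = []
    for start in range(0, len(data), size):
        out.extend(fn(data[start:start + size]))
    return out


def _center(values: Sequence[float]) -> List[float]:
    # Subtract the mean of whatever slice of the data we were given.
    mean = sum(values) / len(values)
    return [v - mean for v in values]


@distill(batches=1)
def center_full(values: Sequence[float]) -> List[float]:
    return _center(values)  # sees the entire dataset in one call


@distill(batches=2)
def center_batched(values: Sequence[float]) -> List[float]:
    return _center(values)  # only sees half of the data per call


data = [1.0, 2.0, 3.0, 4.0]
full = run_distill(center_full, data)        # centered on the global mean
batched = run_distill(center_batched, data)  # centered per batch, not globally
```

With `batches=2` each call only centers its own half, which is the same failure mode a whole-dataset function like t-SNE would hit under batching.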

Mar 11 '23 02:03 xnought

ah yeah I keep forgetting about the fns that need the whole dataset

Mar 11 '23 18:03 cabreraalex