llm Make `InferenceSession` `Clone`-able

Open philpax opened this issue 1 year ago • 1 comment

In one of my test applications, I use an `InferenceSession` to load in a prompt that I later reuse. However, I realised while doing this that you can't actually clone an `InferenceSession` in memory (and I think it should be possible?), so I had to serialize the session to a `Vec<u8>` and rehydrate it when I needed to infer from it.

I think this should be easy enough to fix, but we should check that we aren't violating any weird assumptions if we do so. (I assume this would also allocate another `ctx`, but that should be fine.)

Mar 21 '23 11:03 philpax

Yup, I don't see any problems here (other than this just hasn't been implemented yet) :smile:

This might require some careful handling of the underlying ggml context. Make sure a new context is allocated and any tensor data is copied over to the new context. For this, you might need some C-like pointer fiddling, and maybe to expose a few more ggml functions. Simply cloning the pointers would result in the wrong behavior, and most likely UB. But from your question I think you already accounted for that :+1:
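To make the "allocate a new context and copy the tensor data" point concrete, here is a minimal sketch of a deep `Clone` for a type that owns a raw allocation. Everything in it is a hypothetical stand-in: `RawContext`, `Session`, `memory_offset`, `memory_len` and `n_past` are illustrative names, not the real `llm` or ggml API, and the raw buffer only imitates a ggml context. The sketch also assumes tensors are recorded as offsets into the buffer, which avoids having to rebase pointers after the copy; with real ggml tensors that step would likely need extra bindings.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::{ptr, slice};

/// Hypothetical stand-in for a ggml-style context: one raw allocation
/// that all tensor data lives in. Not the real ggml or `llm` API.
struct RawContext {
    base: *mut u8,
    layout: Layout,
}

impl RawContext {
    fn new(size: usize) -> Self {
        assert!(size > 0, "zero-sized contexts are not supported in this sketch");
        let layout = Layout::from_size_align(size, 16).expect("invalid layout");
        // SAFETY: `layout` has a non-zero size, as checked above.
        let base = unsafe { alloc(layout) };
        if base.is_null() {
            handle_alloc_error(layout);
        }
        // SAFETY: `base` points to `size` writable bytes; zero them so reads are defined.
        unsafe { ptr::write_bytes(base, 0, size) };
        RawContext { base, layout }
    }
}

impl Drop for RawContext {
    fn drop(&mut self) {
        // SAFETY: `base` was allocated with exactly this layout and is freed once.
        unsafe { dealloc(self.base, self.layout) };
    }
}

/// Hypothetical session: owns a context plus one "tensor" inside it,
/// recorded as an offset and length rather than a raw pointer.
struct Session {
    ctx: RawContext,
    memory_offset: usize,
    memory_len: usize,
    n_past: usize,
}

impl Session {
    fn new(ctx_size: usize, memory_len: usize) -> Self {
        assert!(memory_len <= ctx_size, "tensor does not fit in the context");
        Session { ctx: RawContext::new(ctx_size), memory_offset: 0, memory_len, n_past: 0 }
    }

    fn memory(&self) -> &[u8] {
        // SAFETY: offset + len is within the allocation (checked in `new`).
        unsafe { slice::from_raw_parts(self.ctx.base.add(self.memory_offset), self.memory_len) }
    }

    fn memory_mut(&mut self) -> &mut [u8] {
        // SAFETY: as above, and `&mut self` guarantees exclusive access.
        unsafe { slice::from_raw_parts_mut(self.ctx.base.add(self.memory_offset), self.memory_len) }
    }
}

impl Clone for Session {
    fn clone(&self) -> Self {
        // Allocate a fresh context instead of copying the pointer...
        let size = self.ctx.layout.size();
        let ctx = RawContext::new(size);
        // ...then copy the bytes across.
        // SAFETY: both buffers are `size` bytes long and do not overlap.
        unsafe { ptr::copy_nonoverlapping(self.ctx.base, ctx.base, size) };
        Session {
            ctx,
            memory_offset: self.memory_offset,
            memory_len: self.memory_len,
            n_past: self.n_past,
        }
    }
}
```

Deriving `Clone` on a struct like this would copy `base` itself, so both sessions would write to the same buffer and free it twice on drop, which is the wrong behavior and UB mentioned above. With the manual impl, writing through `memory_mut()` on the clone leaves the original's data untouched.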

Mar 21 '23 11:03 setzer22
