llm
Make `InferenceSession` `Clone`-able
In one of my test applications, I use an `InferenceSession` to load in a prompt that I later reuse. However, I realised while doing this that you can't actually clone an `InferenceSession` in memory (and I think it should be possible?), so I had to serialize the session to a `Vec<u8>` and rehydrate it when I needed to infer from it.

I think this should be easy enough to fix, but we should check that there aren't any weird assumptions that we're violating if we do so. (I assume this would also allocate another `ctx`, but that should be fine.)
Yup, I don't see any problems here (other than this just hasn't been implemented yet) :smile:
This might require some careful handling of the underlying ggml context. Make sure a new context is allocated and any tensor data is copied over to the new context. For this, you might need some C-like pointer fiddling and maybe expose a few more ggml functions. Simply cloning the pointers would leave both sessions sharing the same memory, which is the wrong behavior and most likely UB. But from your question I think you already accounted for that :+1:
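
To make the "allocate a new context and copy the data" point concrete, here is a minimal sketch of a deep-copying `Clone` impl. It does not use the real `InferenceSession` or any ggml bindings: `Session`, its `ctx` buffer, `n_past`, and `feed` are hypothetical stand-ins, and a plain byte allocation plays the role of the ggml context. The real implementation would presumably also need to re-create tensors inside the new context rather than just copy bytes.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr;

/// Hypothetical stand-in for a session that owns a raw context buffer.
/// (Not the real `InferenceSession`; the fields are assumptions.)
struct Session {
    ctx: *mut u8,  // owned raw buffer, playing the role of a ggml context
    len: usize,    // size of the buffer in bytes
    n_past: usize, // how many bytes have been written so far
}

impl Session {
    fn new(len: usize) -> Self {
        assert!(len > 0);
        let layout = Layout::array::<u8>(len).unwrap();
        // SAFETY: the layout has non-zero size.
        let ctx = unsafe { alloc(layout) };
        assert!(!ctx.is_null(), "allocation failed");
        // SAFETY: `ctx` points to `len` writable bytes.
        unsafe { ptr::write_bytes(ctx, 0, len) };
        Session { ctx, len, n_past: 0 }
    }

    fn feed(&mut self, byte: u8) {
        assert!(self.n_past < self.len);
        // SAFETY: the index is within the allocation.
        unsafe { *self.ctx.add(self.n_past) = byte };
        self.n_past += 1;
    }

    fn data(&self) -> &[u8] {
        // SAFETY: the first `n_past` bytes are initialized and owned by `self`.
        unsafe { std::slice::from_raw_parts(self.ctx, self.n_past) }
    }
}

impl Clone for Session {
    fn clone(&self) -> Self {
        // Allocate a fresh buffer instead of copying the pointer.
        let mut copy = Session::new(self.len);
        // SAFETY: both buffers are `len` bytes long and do not overlap.
        unsafe { ptr::copy_nonoverlapping(self.ctx, copy.ctx, self.len) };
        copy.n_past = self.n_past;
        copy
    }
}

impl Drop for Session {
    fn drop(&mut self) {
        let layout = Layout::array::<u8>(self.len).unwrap();
        // SAFETY: `ctx` was allocated in `new` with this same layout.
        unsafe { dealloc(self.ctx, layout) };
    }
}
```

With `#[derive(Clone)]` on a struct like this, only the pointer would be copied, so both values would write to the same buffer and both would free it on drop. The manual impl gives each clone its own allocation, so feeding more input to the clone leaves the original untouched.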