Robert Sung-wook Shin
I have the same problem on my tablets (Lenovo M10 Plus 3rd Gen), which run Android 12.
I'm a premium user too. @bmewburn If this is outside your concern, it's OK to close this.
@SlyEcho First, sorry for my lack of knowledge of C/C++. I tried transferring almost all possible variables of the context - `except a method and also with kv data` - to a second destination context, but...
It doesn't seem necessary to keep a duplicate issue open. Closing.
@SlyEcho Thank you for your code. I tried it with @abetlen's example. (Sorry for the mess.) Because I don't know C++, as noted in [my previous mention](https://github.com/ggerganov/llama.cpp/issues/1054#issuecomment-1514592615), I'm not sure if I...
I tried dumping the kv directly. I've seen that the following code works, but I cannot understand why. Is there any difference between using kv_self.buf and this? ```cpp #include #include #include #include...
I hope to see an example for @xaedes's PR. By the way, adding logits to @abetlen's first example, and also to the second, seems to make it work. I think @chrfalch's...
It works great for me! Thank you @xaedes !
I found that every GPU malloc has a matching cudaFree call except the one in [`ggml_cuda_transform_tensor`](https://github.com/ggerganov/llama.cpp/blob/2b2646931bd2a2eb3e21c6f3733cc0e090b2e24b/ggml-cuda.cu#L790) in `ggml-cuda.cu`. Is there a reason to leave the `qkv layers` allocated?
@bfrasure What is TabNine? If it means the code assistant application you mentioned, I don't use it.