Martin Evans
The method to reference for these error codes is [this one](https://github.com/ggerganov/llama.cpp/blob/1fc2f265ff9377a37fd2c61eae9cd813a3491bea/llama.cpp#L6624). Interestingly, the docs on that method say:

> // return 0 on success
> // return positive int on...
> `if (n_tested >= n_ctx) {`

Yep, I'm thinking it's this too; this is basically as far as I got in my digging.

> logging...

As far as I can...
@elgatopanzon thanks for reporting that, saves us a lot of work investigating it! Do you happen to have a link to an upstream issue tracking this bug?
llama.cpp copies cudart, cublas and cublasLt64 into the release package. See here: https://github.com/ggerganov/llama.cpp/blob/master/.github/workflows/build.yml#L497
> but not yet integrated cleanly on the current Executor architecture.

Just a note on that: the current executors will probably be replaced at some point when we swap over...
I've started #361 with an update for new llama.cpp binaries. I'm not sure I'll get time to investigate packaging up the cudart binaries myself, but if anyone wants to work...
Is it possible to redistribute both sets of runtimes in the nuget packages? That way there are no extra manual steps required. > I'd be happy to help with this,...
Thanks for looking into that
Over in #371 Onkitova investigated using cudart, which seems to work. However the files are huge, so we don't want to include them in this repo or in our cuda...
This kind of confusion with ChatSession/History/various executors is actually exactly what got me started contributing to LLamaSharp! If you're interested in making any PRs to improve the current behaviour (even...