Martin Evans
The method to reference for these error codes is [this one](https://github.com/ggerganov/llama.cpp/blob/1fc2f265ff9377a37fd2c61eae9cd813a3491bea/llama.cpp#L6624). Interestingly, the docs on that method say:

> // return 0 on success
> // return positive int on...
> `if (n_tested >= n_ctx) {`

Yep, I'm thinking it's this too; this is basically as far as I got in my digging.

> logging...

As far as I can...
@elgatopanzon thanks for reporting that, saves us a lot of work investigating it! Do you happen to have a link to an upstream issue tracking this bug?
llama.cpp copies cudart, cublas and cublasLt64 into the release package. See here: https://github.com/ggerganov/llama.cpp/blob/master/.github/workflows/build.yml#L497
> but not yet integrated cleanly on the current Executor architecture.

Just a note on that: the current executors will probably be replaced at some point when we swap over...
I've started #361 with an update for new llama.cpp binaries. I'm not sure I'll get time to investigate packaging up the cudart binaries myself, but if anyone wants to work...
Is it possible to redistribute both sets of runtimes in the nuget packages? That way there are no extra manual steps required. > I'd be happy to help with this,...
Thanks for looking into that
Over in #371 Onkitova investigated using cudart, which seems to work. However the files are huge, so we don't want to include them in this repo or in our cuda...
This kind of confusion with ChatSession/History/various executors is actually exactly what got me started contributing to LLamaSharp! If you're interested in making any PRs to improve the current behaviour (even...