Andrei
> Prebuilt wheels with GPU support for all platforms (on GitHub or PyPI). According to my observations, installing with GPU support is the most common problem people hit when installing llama-cpp-python,...
Hey @nivibilla can you provide a log of the messages being sent that cause this issue? The type hints should be correct there, so any need to cast...
@DvitryG yes, source installation will always be the default (`pip install llama-cpp-python`, etc.) and it should offer the most control / performance.
@BDav24 I tried that example with the `jsonschema` package and it doesn't seem that `null` is valid as a stand-in for non-required fields.
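For reference, a quick check with the `jsonschema` package illustrates the behavior described above (the minimal schema here is hypothetical, not the one from the original example): omitting a non-required field validates fine, but passing `null` in its place is rejected.

```python
from jsonschema import validate, ValidationError

# Hypothetical minimal schema: "name" is optional (not listed in
# "required"), but when present it must be a string.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
}

# Omitting the optional field entirely validates.
validate({}, schema)

# Passing null (None) as a stand-in for the optional field is rejected,
# because null is not of type "string".
try:
    validate({"name": None}, schema)
    print("null accepted")
except ValidationError as e:
    print("null rejected:", e.message)
```

So to make `null` acceptable for an optional field, the schema itself would need to allow it, e.g. `{"type": ["string", "null"]}`.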
@mattpulver great work here, I'll review this and should have it merged this week. Cheers
@mattpulver just a quick update, I'm going to hold off on merging this until after #771 because that's going to have a big impact on how we use the llama.cpp api...
Hey @rishsriv I'm still planning to merge this; however, I'm currently grinding through the batch processing support first, as it requires a bunch of internal refactoring. After that I was...
Thanks for the reply @saharNooby. I'll see about putting it into a single file; right now it depends on 3 packages: fastapi (framework), sse_starlette (server-sent events), and uvicorn (server)....
@ansarizafar for server-side usage that would require runtime-specific bindings (node-ffi / Deno FFI). One challenge is that you need to build the shared library on the processor you're targeting...
Hey @benniekiss thank you for the contribution! I reverted the changes to `prev_token` in the `LlamaTokenizer` and changed `special` to default to `False` everywhere to avoid any possible breaking changes....