Steward Garcia

Results: 92 comments by Steward Garcia

Requantize your model with the latest version, and update to the latest server example release.

Download the latest llama.cpp code and compile it with the cmake option `-DLLAMA_BUILD_SERVER=ON`.

### Embeddings

First, run the server with the `--embedding` option:

```bash
server -m models/7B/ggml-model.bin --ctx_size 2048 --embedding
```

Run...
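To query that endpoint from a client, something like the following could work. This is a sketch assuming the server listens on the default port 8080 and accepts a JSON body with a `content` field; the endpoint path and JSON shape may differ between server versions.

```python
import json
import urllib.request

def build_embedding_request(text, host="http://localhost:8080"):
    """Build a POST request for the server's /embedding endpoint (assumed shape)."""
    payload = json.dumps({"content": text}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/embedding",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def get_embedding(text):
    """Send the request; only works while the server is actually running."""
    with urllib.request.urlopen(build_embedding_request(text)) as resp:
        return json.loads(resp.read())["embedding"]
```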

Do you mean a conversion from `embedding to text`? You can generate a list of embeddings and compare them with your input text. OpenAI API:

```python
# a: vector embedding...
```
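For the comparison step, cosine similarity between two embedding vectors can be computed in plain Python, for example:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Identical directions score 1.0, orthogonal vectors score 0.0, so the chunk whose embedding scores closest to 1.0 against the query embedding is the best match.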

This is the code to perform a semantic text search:

```javascript
const axios = require('axios');

let docs_chunks = [
  { text: "Microsoft is a multinational technology company founded by...
```
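The same ranking idea can be sketched in Python, assuming the chunk embeddings have already been computed (e.g. via the server's embedding endpoint); names here are illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def search(query_vec, chunks):
    """Rank document chunks by similarity to the query embedding.

    chunks: list of (text, embedding) pairs; returns texts, best match first.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    return [text for _, text in sorted(scored, reverse=True)]
```

In a real search you would embed the user's query with the same model as the chunks, then take the top-k results.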

@x4080 Go to the build folder:

```bash
llama.cpp/build
```

In the build folder, configure with the server option enabled:

```bash
cmake .. -DLLAMA_BUILD_SERVER=ON
```

Build it:

```bash
cmake --build . --config Release
```

It seems that `llama_free` is not releasing the memory used by the previously loaded weights.

Regarding `--steering-source` and `--steering-layer`: are these values arbitrary, or is there a way to know which ones to use? Trial and error?

@ggerganov > Batched decoding endpoint? This option to generate multiple alternatives for the same prompt requires the ability to change the seed, and the truth is, I've been having a...
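A minimal sketch of what varying the seed per alternative could look like on the client side, assuming a hypothetical completion endpoint that honors a `seed` field in the request body (field names here are illustrative, not the actual server API):

```python
import json

def build_completion_payloads(prompt, n_alternatives, base_seed=42):
    """Build one JSON payload per alternative, identical except for the seed.

    Assumes (hypothetically) that the server's sampler is reseeded per request,
    so distinct seeds yield distinct generations for the same prompt.
    """
    return [
        json.dumps({"prompt": prompt, "seed": base_seed + i})
        for i in range(n_alternatives)
    ]
```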

In my opinion, most of these ggml-based projects share the characteristic of being very lightweight with few dependencies (header-only libraries: httplib.h, json.hpp, stb_image.h, and others), making them portable...

I would suggest creating a small utility in C++ that performs the functionality we are interested in (porting it). From a quick look at the Jinja2cpp library, it has Boost as...