Philpax issues

Results: 182 issues of Philpax

Parallel loading of the model tensors

People have reported faster loading of the models in upstream when the tensors are loaded in parallel: https://github.com/ggerganov/llama.cpp/issues/85 This should be pretty easy to do with Rust if we convert...

issue:enhancement
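As a rough illustration of the idea in this issue, the sketch below copies tensor byte ranges out of an already-available byte buffer (for example, a memory-mapped file) using scoped threads from the standard library. `TensorSpan`, `load_parallel`, and the `workers` parameter are hypothetical names invented for this example; this is not the loader used by upstream or by this project, and the issue text does not say which approach was intended.

```rust
use std::thread;

/// Hypothetical description of where one tensor's bytes live in the file.
pub struct TensorSpan {
    pub name: String,
    pub offset: usize,
    pub len: usize,
}

/// Copies each tensor's bytes out of `file`, splitting the work across
/// up to `workers` scoped threads. Results keep the order of `spans`.
/// Panics if a span reaches past the end of `file`.
pub fn load_parallel(
    file: &[u8],
    spans: &[TensorSpan],
    workers: usize,
) -> Vec<(String, Vec<u8>)> {
    let workers = workers.max(1);
    // Ceiling division so that every span lands in some group.
    let chunk = ((spans.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // Spawn one thread per group of spans.
        let handles: Vec<_> = spans
            .chunks(chunk)
            .map(|group| {
                s.spawn(move || {
                    group
                        .iter()
                        .map(|t| {
                            let bytes = file[t.offset..t.offset + t.len].to_vec();
                            (t.name.clone(), bytes)
                        })
                        .collect::<Vec<_>>()
                })
            })
            .collect();

        // Join in spawn order so the output order matches the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("loader thread panicked"))
            .collect()
    })
}
```

Scoped threads let each worker borrow `file` and `spans` directly, so no reference counting or cloning of the input is needed. Whether this is actually faster depends on the storage and on how much work each tensor needs after being read.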

Build and execute our own computation graph

At present, we are using GGML's computation graph. This works well, but it has a few flaws: 1) We're reliant on whatever support GGML has for threading; the Rust threading...

issue:enhancement

meta:maintenance

topic:backend-support

Add an example of the ReAct pattern

I had trouble trying to get this working, but I must've just not provided enough examples. Here's the llama.cpp example! https://github.com/ggerganov/llama.cpp/commit/a6956b25a1c783e5e96fe06c9c00438f846ef047 Should be as simple as adding that to `examples/`.

issue:enhancement

meta:good-first-issue

Compute perplexity

https://github.com/ggerganov/llama.cpp/commit/486ae645fd3eda8b9d7413d5ff34fb65a3e337fb This should be part of the library (potentially part of `InferenceStats`).

enhancement
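For reference, perplexity is the exponential of the average negative log-likelihood of the evaluated tokens. The sketch below computes it from a slice of per-token natural-log probabilities. The function name `perplexity` and its signature are assumptions made for this example; it is not the linked llama.cpp implementation, and it does not show how the value would be wired into `InferenceStats`.

```rust
/// Computes perplexity from per-token natural-log probabilities:
/// exp(-(1/N) * sum(ln p_i)). Returns `None` for an empty slice,
/// where the average is undefined.
pub fn perplexity(token_log_probs: &[f64]) -> Option<f64> {
    if token_log_probs.is_empty() {
        return None;
    }
    let n = token_log_probs.len() as f64;
    // Average negative log-likelihood over all evaluated tokens.
    let mean_nll = -token_log_probs.iter().sum::<f64>() / n;
    Some(mean_nll.exp())
}
```

As a sanity check, a model that assigns probability 0.25 to every token has a perplexity of 4, which matches a uniform choice among four options.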

GPTQ quantization

The GGML quantization strategy works, but results in a measurable loss in quality. To address this, upstream is investigating the use of the GPTQ algorithm, which quantizes in such a...

issue:enhancement

Quantization does not write the quantization version to `ftype`

Expected behavior: When quantizing with llama.cpp, the quantization version should be written to the `ftype` in the hyperparameters. Current behavior: A `ftype` is produced by `llama_model_quantize_internal` and is...

good first issue

high priority

ggml : unified file format

Obsoletes #147, #150, https://github.com/ggerganov/llama.cpp/issues/1575, https://github.com/ggerganov/llama.cpp/issues/1590, https://github.com/rustformers/llm/discussions/143, and probably some other issues across some other repositories. Please see the spec PR at #302; the following is left as-is so you can...

documentation

enhancement

help wanted

refactoring

GGUF file format specification

Closes #220. Rendered: https://github.com/philpax/ggml/blob/gguf-spec/docs/gguf.md Defines a complete specification for the proposed GGUF file format, which should generically describe models to be loaded by any compatible executor. This is a first...

Structured data extraction/known results for test cases

Hi there! First off, thanks for this - it's great and as-is it's given me some ideas for prompt design 🙏 I'm working on trying to extract dates from arbitrary...

Multi-line prompts are difficult to retrieve from the result table

While working on the problem described in #15, I completed one run and had a table produced of the results. Unfortunately, the resulting prompts are multi-line and quite long, which...