llama.cpp
LLM inference in C/C++
# Environment and Context Hello, before jumping to the subject, here's the environment I'm working with: - Windows 10 - Llama-13b-4bit (GPTQ-quantized) model - Intel® Core™ i7-10700K [AVX | AVX2...
Hello, your [Windows binary releases](https://github.com/ggerganov/llama.cpp/releases) have probably been built with MSVC, and I think there's a better way to do it. # Expected Behavior I have an Intel® Core™ i7-10700K...
Now that we have infinite transcription mode, would it be possible to dump tokens into a file and load them back the next time you run llama.cpp to resume the conversation? Although it...
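A minimal sketch of what dumping and reloading the token history could look like, assuming the conversation is available as a vector of integer token ids (the file layout and function names here are made up for illustration, not an existing llama.cpp API):

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Write the token ids of the current conversation to a file so a later
// run can reload them instead of retyping the whole prompt.
static bool dump_tokens(const char * path, const std::vector<int32_t> & tokens) {
    FILE * f = std::fopen(path, "wb");
    if (!f) return false;
    const uint32_t n = (uint32_t) tokens.size();
    std::fwrite(&n, sizeof(n), 1, f);
    std::fwrite(tokens.data(), sizeof(int32_t), n, f);
    std::fclose(f);
    return true;
}

// Read the token ids back; returns an empty vector if the file is missing
// or truncated.
static std::vector<int32_t> load_tokens(const char * path) {
    std::vector<int32_t> tokens;
    FILE * f = std::fopen(path, "rb");
    if (!f) return tokens;
    uint32_t n = 0;
    if (std::fread(&n, sizeof(n), 1, f) == 1) {
        tokens.resize(n);
        if (std::fread(tokens.data(), sizeof(int32_t), n, f) != n) {
            tokens.clear();
        }
    }
    std::fclose(f);
    return tokens;
}
```

Note that reloading the ids alone would still require re-evaluating them to rebuild the KV cache, so a full resume would also need to persist the model's evaluation state.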
Add support for reading older model files so that people do not have to throw out ggml alpaca models.
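One common way to handle such backward compatibility is to branch on the file header when loading. A rough sketch; the magic values and header layout below are illustrative assumptions, not the actual ggml format definitions:

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical magic numbers: an old unversioned file format and a newer
// versioned one. The real values would come from the ggml loaders.
constexpr uint32_t MAGIC_LEGACY    = 0x67676d6c; // assumed old magic
constexpr uint32_t MAGIC_VERSIONED = 0x67676d66; // assumed new magic

// Decide how to parse the rest of the file based on its header.
// Returns the format version, 0 for legacy files, -1 on error.
static int detect_format(FILE * f) {
    uint32_t magic = 0;
    if (std::fread(&magic, sizeof(magic), 1, f) != 1) return -1;
    if (magic == MAGIC_LEGACY) {
        return 0;               // old files: no version field
    }
    if (magic == MAGIC_VERSIONED) {
        uint32_t version = 0;
        if (std::fread(&version, sizeof(version), 1, f) != 1) return -1;
        return (int) version;   // new files carry an explicit version
    }
    return -1;                  // unknown file
}
```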
I want to use a prompt from a file via the `-f` option together with Alpaca models. However, when I do so, llama.cpp first prints out the whole input. How to avoid...
Do not insert a "newline" token if the user inputs an empty line. This lets the user continue the output after being asked by the reverse prompt for more data. Otherwise...
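Roughly, the change amounts to skipping tokenization when the input line is empty, something like this sketch (the helper names and types are placeholders, not the actual variables in main.cpp):

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Stand-in tokenizer: splits on whitespace and hashes each word to an id.
// The real code would call the model's tokenizer instead.
static std::vector<int> tokenize(const std::string & text) {
    std::vector<int> ids;
    std::istringstream ss(text);
    std::string word;
    while (ss >> word) {
        ids.push_back((int) (std::hash<std::string>{}(word) % 32000));
    }
    return ids;
}

// Only append tokens when the user actually typed something; an empty line
// then means "keep generating" rather than injecting a newline token.
static void on_user_input(const std::string & buffer, std::vector<int> & embd_inp) {
    if (!buffer.empty()) {
        const std::vector<int> line_inp = tokenize(buffer);
        embd_inp.insert(embd_inp.end(), line_inp.begin(), line_inp.end());
    }
}
```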
To avoid code duplication when implementing additional quantization formats (#456), refactor the `forward_mul_mat` and `forward_get_rows` functions to use a table of function pointers, indexed by `ggml_type`. This makes some functions...
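Sketched with made-up type and kernel names (the real table would live in ggml and cover its actual quantization formats), the function-pointer dispatch could look like:

```cpp
#include <array>

// Hypothetical per-type kernel signature; the real code would plug in its
// actual dequantization / dot-product routines.
using dequantize_row_fn = void (*)(const void * src, float * dst, int n);

enum example_type { TYPE_F32, TYPE_Q4_0, TYPE_Q4_1, TYPE_COUNT };

static void dequantize_row_f32 (const void *, float *, int) { /* ... */ }
static void dequantize_row_q4_0(const void *, float *, int) { /* ... */ }
static void dequantize_row_q4_1(const void *, float *, int) { /* ... */ }

// One table indexed by type: functions like forward_get_rows / forward_mul_mat
// look the kernel up here instead of switching on the type in every caller,
// so a new quantization format only needs a new table entry.
static const std::array<dequantize_row_fn, TYPE_COUNT> dequantize_row = {
    dequantize_row_f32,   // TYPE_F32
    dequantize_row_q4_0,  // TYPE_Q4_0
    dequantize_row_q4_1,  // TYPE_Q4_1
};

static void get_rows(example_type type, const void * src, float * dst, int n) {
    dequantize_row[type](src, dst, n);
}
```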
# Prerequisites Please answer the following questions for yourself before submitting an issue. - [ :white_check_mark: ] I am running the latest code. Development is very rapid so there are...
After the context grows beyond 2048 (or whatever is set with -c), if you close the terminal, the process may keep running in the background. Linux amd64 5.19, Ubuntu base.
Implement support for running models that use LLaMA-Adapter (https://github.com/ZrrSkywalker/LLaMA-Adapter). How to obtain the model is described here: https://github.com/ZrrSkywalker/LLaMA-Adapter#inference