Lukas Kreussel
@cometyang This is a real head-scratcher for me. I'm guessing `llama.cpp` still works fine on your machine? I'm probably going to downgrade my CUDA version to 11.8 in my WSL instance...
Hm, that's a tricky one. Most likely the hyperparameters of the model diverge from the format defined by `llama.cpp`, but `llm` should be able to handle that. Could you split...
After taking a quick look, the hyperparameters seem valid. This file was probably created by an older version of ggml, where they didn't adjust the tensor metadata size...
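For anyone debugging a similar file, here is a minimal Rust sketch for dumping the magic number and a GPT-J-style hyperparameter block. The field list, its order, and the `model.bin` path are assumptions (they vary per architecture and ggml version), so treat it as a starting point rather than a parser for this exact format:

```rust
use std::fs::File;
use std::io::{self, Read};

/// Hypothetical GPT-J-style hyperparameter block; the exact fields
/// and their order vary per architecture and ggml version.
#[derive(Debug)]
struct Hyperparameters {
    n_vocab: u32,
    n_ctx: u32,
    n_embd: u32,
    n_head: u32,
    n_layer: u32,
    n_rot: u32,
    ftype: u32,
}

fn read_u32(r: &mut impl Read) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    r.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}

fn main() -> io::Result<()> {
    let mut f = File::open("model.bin")?; // placeholder path
    // Legacy ggml files start with a magic number; unversioned files
    // predate the tensor metadata size adjustment mentioned above.
    let magic = read_u32(&mut f)?;
    println!("magic: {magic:#x}");
    let hparams = Hyperparameters {
        n_vocab: read_u32(&mut f)?,
        n_ctx: read_u32(&mut f)?,
        n_embd: read_u32(&mut f)?,
        n_head: read_u32(&mut f)?,
        n_layer: read_u32(&mut f)?,
        n_rot: read_u32(&mut f)?,
        ftype: read_u32(&mut f)?,
    };
    println!("{hparams:?}");
    Ok(())
}
```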
Could we also include some optional generation parameters, which contain default values for some sampling parameters? Or would that be too specific?
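To make that concrete, here is a minimal sketch of what such an optional block could look like; the struct name, fields, and default values are illustrative assumptions, not the actual `llm` API:

```rust
/// Hypothetical optional generation parameters; names and defaults
/// are illustrative, not the real `llm` types.
#[derive(Debug, Clone)]
pub struct GenerationParameters {
    pub temperature: f32,
    pub top_k: usize,
    pub top_p: f32,
    pub repeat_penalty: f32,
}

impl Default for GenerationParameters {
    fn default() -> Self {
        Self {
            temperature: 0.8,
            top_k: 40,
            top_p: 0.95,
            repeat_penalty: 1.1,
        }
    }
}

fn main() {
    // Callers override only the fields they care about.
    let params = GenerationParameters {
        temperature: 0.2,
        ..Default::default()
    };
    println!("{params:?}");
}
```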
I'm pretty sure that either your model isn't a fully compatible GPT-J model or there are differences in the tokenizer. Have you tried to load your converted GPT-J model with...
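One quick way to check the tokenizer side is to tokenize the same string with the tokenizer that shipped with the original model and compare the IDs against what the converted model produces. A minimal sketch using the `tokenizers` crate (the crate choice, version, and `tokenizer.json` path are assumptions):

```rust
// Cargo.toml (assumed): tokenizers = "0.13"
use tokenizers::Tokenizer;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Placeholder path: the tokenizer that shipped with the original model.
    let tok = Tokenizer::from_file("tokenizer.json")?;
    let enc = tok.encode("Hello world", false)?;
    // Compare these IDs against the tokens your converted model produces;
    // any divergence points at the tokenizer rather than the weights.
    println!("{:?}", enc.get_ids());
    Ok(())
}
```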
If you want, you could create a PR containing these changes. The only thing I don't like about it is that `ModelParameters` will then contain the `n_gqa` parameter, which...
In my opinion we should just hack it in for now; GGUF seems to be nearly ready, meaning when we implement it we can clean up the implementation. The most important...
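To make the stop-gap concrete, here is a rough Rust sketch of what "hacking it in" could look like: an optional `n_gqa` override on the load-time parameters, used to derive the number of key/value heads. The struct and function names are illustrative, not the actual `llm` definitions; the only model that currently needs the override is LLaMA-2 70B, which uses `n_gqa = 8`.

```rust
/// Hypothetical sketch: carry `n_gqa` as an optional override on the
/// load-time parameters. Not the real `llm` `ModelParameters`.
#[derive(Debug, Default)]
pub struct ModelParameters {
    /// Grouped-query attention factor; only LLaMA-2 70B needs it (8),
    /// every other architecture can leave it `None`.
    pub n_gqa: Option<usize>,
}

fn n_head_kv(n_head: usize, params: &ModelParameters) -> usize {
    // With grouped-query attention, several query heads share one
    // key/value head: n_head_kv = n_head / n_gqa.
    match params.n_gqa {
        Some(n_gqa) => n_head / n_gqa,
        None => n_head, // classic multi-head attention
    }
}

fn main() {
    let params = ModelParameters { n_gqa: Some(8) };
    // LLaMA-2 70B: 64 query heads sharing 8 key/value heads.
    assert_eq!(n_head_kv(64, &params), 8);
    println!("n_head_kv = {}", n_head_kv(64, &params));
}
```

Once GGUF lands, the value should come from the file's metadata instead of a user-supplied parameter, which is why keeping the override cheap and removable matters.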
Currently only `llama`-based models are accelerated by Metal/CUDA/OpenCL. If you use another architecture, like `gpt-neox`, it will fall back to CPU-only inference. What you are seeing in your stdout...
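As a rough picture of that dispatch (illustrative only, not the actual `llm` code): acceleration support is keyed on the model architecture, and everything that isn't `llama` takes the CPU path.

```rust
/// Hypothetical architectures; only the `llama` family is
/// GPU-accelerated in this sketch, mirroring the behaviour above.
#[derive(Debug)]
enum Architecture {
    Llama,
    GptNeoX,
    GptJ,
}

/// Whether offloading layers to Metal/CUDA/OpenCL is supported.
fn supports_gpu_offload(arch: &Architecture) -> bool {
    matches!(arch, Architecture::Llama)
}

fn main() {
    for arch in [Architecture::Llama, Architecture::GptNeoX, Architecture::GptJ] {
        if supports_gpu_offload(&arch) {
            println!("{arch:?}: offloading layers to GPU");
        } else {
            // Everything else silently falls back to CPU-only
            // inference, even if a GPU backend was compiled in.
            println!("{arch:?}: falling back to CPU");
        }
    }
}
```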
Alright, I can't send you the Dockerfile, but I created a toy example with your own server. Dockerfile:
```
FROM python:3.10

# install
RUN pip3 install llama-cpp-python[server]

# Expose the ports
EXPOSE 8000...
```
That's the strange thing: the Dockerfile listed above works without any problems, but when I try to run my Dockerfile I get the "GLIBC" error. This is my Dockerfile:
```
...
```