Fomenko
Please provide more information about your hardware and software versions, and which model version you are using.
> CUDA isn't deterministic unl...

You need to provide more information, like the model, the temperature setting, the quantization type, and so on. In my opinion the quantization is the problem.
> make LLAMA_HIPBLAS=1 -j4

```
(base) user@myusb:~/llama.cpp$ make LLAMA_HIPBLAS=1 -j4
I llama.cpp build info:
I UNAME_S:  Linux
I UNAME_P:  x86_64
I UNAME_M:  x86_64
I CFLAGS:   -I. -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG...
```
OK, I compiled it, and now when I run:

```
(base) user@myusb:~/llama.cpp$ ./main -i -m ~/mistral-7b-instruct-v0.2.Q8_0.gguf -ngl 999
Log start
main: build = 2047 (b05102fe)
main: built with cc (Ubuntu...
```
```
(base) user@myusb:~/llama.cpp$ rocm-smi
======================================== ROCm System Management Interface ========================================
================================================== Concise Info ==================================================
Device  [Model : Revision]  Temp  Power  Partitions  SCLK  MCLK  Fan  Perf  PwrCap  VRAM%  GPU%
        Name (20...
```
### Transformers

**Here is Transformers; it works properly if you chat.**

**And this is the behavior of Transformers if you don't chat, only load the...**
> I can't reproduce with just `main -i -m foo.gguf -ngl 999`. Can you?

Can you please show your GPU behavior in the 5 states? And tell us your hardware, software...
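To capture what the GPU is doing across those states, a simple polling loop over `rocm-smi` can help. This is a rough sketch, not an official procedure; the `--showuse` flag and the one-second interval are my assumptions, adjust to taste:

```shell
#!/bin/sh
# Sketch: sample GPU utilization a few times while llama.cpp sits at
# its interactive prompt; --showuse prints the current GPU use percentage.
for i in 1 2 3 4 5; do
    rocm-smi --showuse
    sleep 1
done
```

Running this once per state (model loading, generating, waiting at the prompt, and so on) would give comparable numbers for each.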
> llama.cpp, and more specifically the CUDA backend, is single-threaded. While waiting for user input, there is no other code running and no work being submitted to the...
> It really isn't. I suggest that you take a look at AMD's debugging tools to try to understand what the GPU is doing while it should be idle, assuming...
> ```shell
> export GPU_MAX_HW_QUEUES=1
> ```

Thank you, setting the variable solved my problem. Why can't llama.cpp set this internally? And how does Transformers solve this problem...