
Main.exe from release VS Main.exe build locally => different behaviour

Open sergedc opened this issue 2 years ago • 8 comments

Expected Behavior

I expect to get the same behavior when I use the main.exe from llama-master-e7f6997-bin-win-avx2-x64.zip and when I download the source code and build the main myself.

FYI, I run the main.exe with these parameters: main -m ../../model/ggml-gpt4-x-13b-q4_1.bin --color -f ./alpaca.txt -ins -b 512 -c 2048 -n 2048 --top_k 10000 --temp 0.2 --repeat_penalty 1 -t 7

Current Behavior

On the main.exe from the zip: after I ask a question, the AI answers, and I am presented with ">" for my next question. On the main.exe that I built myself (Windows, CMake, gcc and g++), following the exact instructions provided: after I ask a question, the AI answers, ">" appears, and then the AI keeps writing (usually about a sports event, e.g. FIFA, that happened in 2018).

[Screenshot 2023-04-12 180150]

Environment and Context

Windows 11

32 GB RAM, Ryzen CPU
-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
cmake version 3.26.3

sergedc avatar Apr 12 '23 22:04 sergedc

--seed?

rabidcopy avatar Apr 12 '23 22:04 rabidcopy

> --seed?

I just tried both again with "--seed 5" and got the same behaviour. I also tried another build from 5 days ago: same behaviour.
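A hypothetical way to compare the two binaries side by side under a fixed seed (the file names, model path, and prompt file below are placeholders, not from this thread):

```shell
# Run each binary with an identical seed and prompt, then diff the
# transcripts; if both follow the same sampling path, the output matches.
./main-release.exe -m model.bin --seed 5 -f prompt.txt > release.txt
./main-local.exe   -m model.bin --seed 5 -f prompt.txt > local.txt
diff release.txt local.txt
```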

I should also mention that the downloaded main.exe from the zip is 216 kB, while the one I build myself is 3,262 kB...

sergedc avatar Apr 12 '23 22:04 sergedc

The size of the exe makes me think you are building a debug version and not a release version.

With regard to it continuing to output after it should stop: I have noticed that all the models seem to have a sweet spot for the parameters, depending on what you are trying to get them to do. For yours, try changing -n to -1.

tcristo avatar Apr 12 '23 23:04 tcristo

Hi, regarding exe file size, my instructions were, from the build folder:

cmake -G "MinGW Makefiles" ..
cmake --build . --config Release
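Worth noting: "MinGW Makefiles" is a single-configuration generator, so `--config Release` at build time is silently ignored and CMake falls back to an empty (unoptimized) build type, which would explain the much larger exe. A sketch of the fix is to choose the build type at configure time instead:

```shell
# From the build folder: single-config generators take the build type
# as a cache variable at configure time, not via --config.
cmake -G "MinGW Makefiles" -DCMAKE_BUILD_TYPE=Release ..
cmake --build .
```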

With regard to "-n -1", it did not change anything. It still keeps talking after I get the answer.

sergedc avatar Apr 13 '23 00:04 sergedc

What is the expected way of building this? Would installing visual studio and using that possibly solve the problem?

I am trying to debug the problem in main.cpp:

    if (!std::getline(std::wcin, wline)) {
        // input stream is bad or EOF received
        return 0;
    }
    win32_utf8_encode(wline, line);
    std::ofstream outfile2("test.csv");
    outfile2 << line.c_str();
    outfile2.close();

In the csv file, there is nothing! No space; just 1 line, 1 column, with nothing.

sergedc avatar Apr 13 '23 18:04 sergedc

Interesting: the flow goes like this:

  1. 1st question
  2. 1st answer
  3. Random answer
  4. 2nd question
  5. 2nd answer
  6. Random answer (the exact same random answer as in 3.)

sergedc avatar Apr 13 '23 18:04 sergedc

> What is the expected way of building this? Would installing visual studio and using that possibly solve the problem? [...]

I use vs2022 to build my windows executable. My release version is around 200k and my debug is around 1M. My release version is considerably faster than my debug version.

I don't currently use the same model as you; I'm using variants of both Vicuna and Koala right now. Once the parameters are fine-tuned, they are as consistent as current LLMs allow them to be.

tcristo avatar Apr 13 '23 19:04 tcristo

Thanks. I will try with VS2022

If I might ask, what parameters are you using for what purpose?

Me, I am using it to summarize 5 to 10 lines of text, as well as comparing short pieces of text for similarities and differences. What parameters would you recommend?

Could you point me to the Vicuna and Koala variants you are using? Are they unfiltered? I think that also makes an impact on quality.

Thanks!

sergedc avatar Apr 13 '23 20:04 sergedc

FYI, using VS2022, this problem disappeared.

sergedc avatar Apr 22 '23 16:04 sergedc