Main.exe from release vs. Main.exe built locally => different behaviour
Expected Behavior
I expect to get the same behavior when I use the main.exe from llama-master-e7f6997-bin-win-avx2-x64.zip and when I download the source code and build the main myself.
FYI, I run the main.exe with these parameters: main -m ../../model/ggml-gpt4-x-13b-q4_1.bin --color -f ./alpaca.txt -ins -b 512 -c 2048 -n 2048 --top_k 10000 --temp 0.2 --repeat_penalty 1 -t 7
Current Behavior
On the main.exe from the zip: after I ask a question, the AI answers, and I am presented with ">" for my next question.
On the main.exe that I build myself (Windows, CMake, gcc and g++), following the exact instructions provided: after I ask a question, the AI answers, ">" appears, and then the AI keeps writing (usually about a sports event, e.g. FIFA, that happened in 2018).

Environment and Context
Windows 11
32 GB RAM, Ryzen CPU
GCC/G++:
-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
cmake version 3.26.3
--seed?
I just tried both again with "--seed 5" and got the same behaviour. I also tried another build from 5 days ago: same behaviour.
I should also mention that the downloaded main.exe from the zip is 216 KB, while the one I build myself is 3,262 KB.
The size of the exe makes me think you are building a debug version, not a release version.
With regards to it continuing to output after it should stop, I have noticed all the models seem to have a sweet spot in the parameters depending on what you are trying to get them to do. For yours, try changing -n to -1.
Hi, regarding exe file size, my instructions were, from the build folder:
cmake -G "MinGW Makefiles" ..
cmake --build . --config Release
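One thing worth checking: MinGW Makefiles is a single-configuration generator, so --config Release at build time has no effect there; the build type has to be chosen at configure time. A sketch of commands that should produce an optimized build (run from the build folder, assuming the same layout as above):

```shell
# MinGW Makefiles is single-config: select the build type when
# configuring. --config only applies to multi-config generators
# such as Visual Studio.
cmake -G "MinGW Makefiles" -DCMAKE_BUILD_TYPE=Release ..
cmake --build .
```

Without CMAKE_BUILD_TYPE set, the Makefile generator builds with no optimization flags at all, which would also explain the much larger exe.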
With regards to "--n -1" , it did not change anything. It still keeps talking after I got the answer.
What is the expected way of building this? Would installing visual studio and using that possibly solve the problem?
I am trying to debug the problem in main.cpp:
    if (!std::getline(std::wcin, wline)) {
        // input stream is bad or EOF received
        return 0;
    }
    win32_utf8_encode(wline, line);
    std::ofstream outfile2("test.csv");
    outfile2 << line.c_str();
    outfile2.close();
In the CSV file, there is nothing! No spaces, just 1 line, 1 column, with nothing in it.
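Note that if that write runs inside the interactive loop, std::ofstream's default mode truncates test.csv on every open, so only the most recent line survives, and an empty final read would leave an empty file. A minimal sketch of append-mode logging for this kind of debugging (the append_log name and file path are mine, not from main.cpp):

```cpp
#include <fstream>
#include <string>

// Append one captured input line to a log file instead of
// truncating the file on every open (ofstream's default).
void append_log(const std::string& path, const std::string& line) {
    std::ofstream out(path, std::ios::app);  // open in append mode
    out << line << '\n';
}
```

With this, every line read from std::wcin is preserved across loop iterations, so you can see exactly what the build captured rather than only the last (possibly empty) read.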
Interesting: the flow goes like this:
- 1st question
- 1st answer
- Random answer
- 2nd question
- 2nd answer
- Random answer (the exact same random answer as the first one)
I use VS2022 to build my Windows executable. My release version is around 200 KB and my debug is around 1 MB. My release version is considerably faster than my debug version.
I don't currently use the same model as you; I'm using variants of both Vicuna and Koala right now. Once the parameters are fine-tuned, they are as consistent as the current models of LLMs allow them to be.
Thanks. I will try with VS2022
If I might ask, what parameters are you using for what purpose?
I am using it to summarize 5 to 10 lines of text, as well as comparing short pieces of text for similarities and differences. What parameters would you recommend?
Could you point me to the Vicuna and Koala variants you are using? Are they unfiltered? I think that also makes an impact on quality.
Thanks!
FYI, using VS2022, this problem disappeared.