
[Bug] Zero temperature yields incorrect results

Open guberti opened this issue 1 year ago • 0 comments

Passing --temp 0 causes GPT-J (and, I suspect, all other models) to behave very strangely. See the following output:

guberti@Precision-7540:~/ggml/build$ ./bin/gpt-j -p "Once upon a time there was a" --temp 0
main: seed = 1678657256
gptj_model_load: loading model from 'models/gpt-j-6B/ggml-model.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 1
gptj_model_load: ggml ctx size = 13334.86 MB
gptj_model_load: memory_size =  1792.00 MB, n_mem = 57344
gptj_model_load: ................................... done
gptj_model_load: model size = 11542.79 MB / num tensors = 285
main: number of tokens in prompt = 7

Once upon a time there was aGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

No matter what prompt is used, the model repeatedly generates the token G.

Solution

This is caused by a division by zero in examples/utils.cpp — with temp == 0, scale becomes infinite and the scaled logits blow up:

{
    // With --temp 0, scale = 1.0/0.0 evaluates to +inf,
    // so every scaled logit becomes +/-inf (or NaN for a zero logit)
    const double scale = 1.0/temp;
    for (int i = 0; i < n_logits; ++i) {
        logits_id.push_back(std::make_pair(logits[i]*scale, i));
    }
}

Happy to contribute a simple fix if @ggerganov is busy.

guberti · Mar 12 '23