
Add compatibility with new sampling algorithms in llama.cpp

Open · kuvaus opened this pull request 1 year ago • 1 comment

Title: Add compatibility with new sampling algorithms in llama.cpp

Description: This pull request addresses issue https://github.com/nomic-ai/gpt4all-chat/issues/200#issue-1689677866 by adding compatibility with new sampling algorithms in llama.cpp.

Changes:

Implemented temperature sampling with repetition penalty as an alternative to the previous llama_sample_top_p_top_k sampling method.

        // Temperature sampling with repetition_penalty, replacing the old
        // combined llama_sample_top_p_top_k call with llama.cpp's new
        // composable sampler functions.
        // Penalize tokens that appeared in the last repeat_last_n tokens of the context.
        llama_sample_repetition_penalty(
            d_ptr->ctx, &candidates_data,
            promptCtx.tokens.data() + promptCtx.n_ctx - promptCtx.repeat_last_n, promptCtx.repeat_last_n,
            promptCtx.repeat_penalty);
        // Keep only the top_k most likely candidates.
        llama_sample_top_k(d_ptr->ctx, &candidates_data, promptCtx.top_k);
        // Nucleus filtering: keep the smallest set whose cumulative probability reaches top_p.
        llama_sample_top_p(d_ptr->ctx, &candidates_data, promptCtx.top_p);
        // Rescale logits by temperature before sampling.
        llama_sample_temperature(d_ptr->ctx, &candidates_data, promptCtx.temp);
        // Draw the next token from the filtered distribution.
        llama_token id = llama_sample_token(d_ptr->ctx, &candidates_data);
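For context, the chained calls above implement a standard sampling pipeline: penalize recently seen tokens, truncate to the top-k candidates, apply nucleus (top-p) filtering, rescale by temperature, then sample. Below is a self-contained sketch of that logic. It is a simplified illustration, not llama.cpp's actual implementation; the `TokenData` struct and all function names here are hypothetical stand-ins for the library's types and samplers.

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Hypothetical stand-in for llama.cpp's per-token candidate entry:
// a token id with its raw logit and (later) normalized probability.
struct TokenData {
    int   id;
    float logit;
    float p;
};

// Repetition penalty: shrink positive logits (and amplify negative ones)
// for tokens that appeared in the recent context window.
void apply_repetition_penalty(std::vector<TokenData>& cands,
                              const std::vector<int>& last_tokens,
                              float penalty) {
    for (auto& c : cands) {
        bool seen = std::find(last_tokens.begin(), last_tokens.end(), c.id)
                    != last_tokens.end();
        if (seen) {
            c.logit = c.logit > 0 ? c.logit / penalty : c.logit * penalty;
        }
    }
}

// Top-k: keep only the k highest-logit candidates.
void apply_top_k(std::vector<TokenData>& cands, size_t k) {
    if (k >= cands.size()) return;
    std::partial_sort(cands.begin(), cands.begin() + k, cands.end(),
                      [](const TokenData& a, const TokenData& b) {
                          return a.logit > b.logit;
                      });
    cands.resize(k);
}

// Numerically stable softmax over logits, filling in each candidate's p.
void softmax(std::vector<TokenData>& cands) {
    float max_l = cands[0].logit;
    for (const auto& c : cands) max_l = std::max(max_l, c.logit);
    float sum = 0.0f;
    for (auto& c : cands) { c.p = std::exp(c.logit - max_l); sum += c.p; }
    for (auto& c : cands) c.p /= sum;
}

// Top-p (nucleus): keep the smallest prefix of probability-sorted
// candidates whose cumulative probability reaches top_p.
void apply_top_p(std::vector<TokenData>& cands, float top_p) {
    std::sort(cands.begin(), cands.end(),
              [](const TokenData& a, const TokenData& b) {
                  return a.logit > b.logit;
              });
    softmax(cands);
    float cum = 0.0f;
    size_t keep = cands.size();
    for (size_t i = 0; i < cands.size(); ++i) {
        cum += cands[i].p;
        if (cum >= top_p) { keep = i + 1; break; }
    }
    cands.resize(keep);
}

// Temperature: divide logits by temp (<1 sharpens, >1 flattens the distribution).
void apply_temperature(std::vector<TokenData>& cands, float temp) {
    for (auto& c : cands) c.logit /= temp;
}

// Sample one token id from the remaining candidates, weighted by probability.
int sample_token(std::vector<TokenData>& cands, std::mt19937& rng) {
    softmax(cands);
    std::vector<float> probs;
    for (const auto& c : cands) probs.push_back(c.p);
    std::discrete_distribution<size_t> dist(probs.begin(), probs.end());
    return cands[dist(rng)].id;
}
```

The key design point of the new llama.cpp API is that each stage operates on the same mutable candidate array, so callers can compose, reorder, or skip samplers, which is exactly what this patch exploits.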

kuvaus avatar Apr 30 '23 12:04 kuvaus

I will look at this, but I will need to update the submodule at the same time; otherwise this will break. But this helps a ton! Thanks @kuvaus!

manyoso avatar Apr 30 '23 12:04 manyoso