llama.cpp Feature Request: Support for large reward-type models (Nemotron-4-340B-Reward)

Open tblattner opened this issue 6 months ago • 1 comment

Prerequisites

[X] I am running the latest code. Mention the version if possible as well.
[X] I carefully followed the README.md.
[X] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
[X] I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

llama.cpp's scalability has made it possible for many groups to experiment with LLMs. NVIDIA released the Nemotron-4-340B-Reward model, which alters the last layer to rate responses on 5 criteria: Helpfulness, Correctness, Coherence, Complexity, and Verbosity.

The ability to easily customize how the output logits are processed would make it much simpler to support these kinds of changes to LLMs.
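
To make the request concrete, here is a minimal sketch of what such a modified output stage might look like. It assumes the altered last layer is a plain linear projection from the final hidden state to one score per criterion; `reward_head`, the toy weights, and the 3-dimensional hidden state are all hypothetical and are not taken from llama.cpp or from NVIDIA's code.

```python
from typing import Dict, Sequence

# Attribute names as listed in the feature description above.
ATTRIBUTES = ["Helpfulness", "Correctness", "Coherence", "Complexity", "Verbosity"]


def reward_head(hidden: Sequence[float],
                weights: Sequence[Sequence[float]],
                bias: Sequence[float]) -> Dict[str, float]:
    """Project one hidden-state vector to one score per attribute.

    Hypothetical helper: assumes a linear head with one weight row
    and one bias value per attribute.
    """
    if len(weights) != len(ATTRIBUTES) or len(bias) != len(ATTRIBUTES):
        raise ValueError("expected one weight row and one bias per attribute")
    scores: Dict[str, float] = {}
    for name, row, b in zip(ATTRIBUTES, weights, bias):
        if len(row) != len(hidden):
            raise ValueError("weight row length must match hidden size")
        # Dot product of the weight row with the hidden state, plus bias.
        scores[name] = sum(w * h for w, h in zip(row, hidden)) + b
    return scores


# Toy data: a 3-dimensional hidden state stands in for the real,
# much larger hidden size of an actual model.
hidden_state = [1.0, 2.0, 3.0]
toy_weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
    [0.5, 0.5, 0.0],
]
toy_bias = [0.0, 0.0, 0.0, 0.0, 0.0]

scores = reward_head(hidden_state, toy_weights, toy_bias)
for name, value in scores.items():
    print(f"{name}: {value}")
```

The point of the sketch is that the output is a short vector of scores rather than a vocabulary-sized logit vector, so the usual sampling step does not apply and the caller would need a way to read the raw head output directly.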

Motivation

Adding support for new types of models that target specific areas, such as reward modeling.

Possible Implementation

I'm interested in pointers on how to implement such a feature. I would be happy to take a closer look to see how easy or hard it would be to implement.

Aug 27 '24 15:08 tblattner
