added logprobs (`n_probs`)
As discussed on Discord, I implemented the feature. It just passes the probs through from the llama.cpp server. Sorry, first time writing Go, so I might have missed something.
https://discord.com/channels/1128867683291627614/1128867684130508875/1187028494228664340
Just noticed the entire llama.go file got rewritten in the meantime :/ maybe the person refactoring llama.go could give a hint as to where to implement it now? I took a quick look in the online merge editor but it's not trivial.
Happy Holidays!
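For context, the change is conceptually small: `n_probs` is forwarded to the llama.cpp server, and the `completion_probabilities` array it returns is copied into each streamed generate response. Roughly, the shape of the data being passed through looks like the sketch below (a minimal sketch only; the field and type names are illustrative, not the exact ones in this diff):

```go
package api

import "time"

// TokenProb mirrors one entry of llama.cpp's "probs" array.
type TokenProb struct {
	Prob   float64 `json:"prob"`
	TokStr string  `json:"tok_str"`
}

// CompletionProbability mirrors one entry of "completion_probabilities".
type CompletionProbability struct {
	Content string      `json:"content"`
	Probs   []TokenProb `json:"probs"`
}

// GenerateResponse gains one optional field that is copied verbatim from the
// llama.cpp server prediction and serialized back to the client.
type GenerateResponse struct {
	Model                   string                  `json:"model"`
	CreatedAt               time.Time               `json:"created_at"`
	Response                string                  `json:"response"`
	CompletionProbabilities []CompletionProbability `json:"completion_probabilities,omitempty"`
	Done                    bool                    `json:"done"`
}
```

Example output: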
```
janpf@p085info010013 ~> curl http://localhost:11434/api/generate -d '{"model":"llama2", "temperature":0, "prompt": "Give me your probabilities", "options": {"num_predict": 2, "n_probs":2}}' | jq
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.615133Z",
"response": "\n",
"completion_probabilities": [
{
"content": "\n",
"probs": [
{
"prob": 0.5468829274177551,
"tok_str": "As"
},
{
"prob": 0.38397082686424255,
"tok_str": "\n"
}
]
}
],
"done": false
}
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.632652Z",
"response": "As",
"completion_probabilities": [
{
"content": "As",
"probs": [
{
"prob": 0.6612563133239746,
"tok_str": "As"
},
{
"prob": 0.3387437164783478,
"tok_str": "I"
}
]
}
],
"done": false
}
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.650095Z",
"response": " a",
"completion_probabilities": [
{
"content": " a",
"probs": [
{
"prob": 0.9904254078865051,
"tok_str": " a"
},
{
"prob": 0.009574598632752895,
"tok_str": " an"
}
]
}
],
"done": false
}
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.667608Z",
"response": " responsible",
"completion_probabilities": [
{
"content": " responsible",
"probs": [
{
"prob": 0.9597197771072388,
"tok_str": " responsible"
},
{
"prob": 0.04028024524450302,
"tok_str": " neutral"
}
]
}
],
"done": false
}
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.685076Z",
"response": " A",
"completion_probabilities": [
{
"content": " A",
"probs": [
{
"prob": 0.9497269988059998,
"tok_str": " A"
},
{
"prob": 0.050273049622774124,
"tok_str": " and"
}
]
}
],
"done": false
}
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.685199Z",
"response": "",
"done": true,
"context": [
518,
25580,
29962,
3532,
14816,
29903,
29958,
5299,
829,
14816,
29903,
6778,
13,
13,
29954,
573,
592,
596,
2070,
11614,
518,
29914,
25580,
29962,
13,
13,
2887,
263,
14040,
319
],
"total_duration": 337091792,
"load_duration": 3560750,
"prompt_eval_count": 11,
"prompt_eval_duration": 267993000,
"eval_count": 4,
"eval_duration": 52417000
}
```
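If it helps review, here is a rough sketch of consuming the new field from Go (the types are minimal stand-ins for illustration, not the actual ollama `api` package):

```go
// Rough client-side sketch: stream /api/generate and print the top
// alternatives reported for each emitted token.
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

type tokenProb struct {
	Prob   float64 `json:"prob"`
	TokStr string  `json:"tok_str"`
}

type completionProb struct {
	Content string      `json:"content"`
	Probs   []tokenProb `json:"probs"`
}

type generateChunk struct {
	Response                string           `json:"response"`
	CompletionProbabilities []completionProb `json:"completion_probabilities"`
	Done                    bool             `json:"done"`
}

func main() {
	body := []byte(`{"model":"llama2","prompt":"Give me your probabilities","options":{"num_predict":2,"n_probs":2}}`)
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Each line of the streamed response is one JSON object, as in the output above.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var chunk generateChunk
		if err := json.Unmarshal(scanner.Bytes(), &chunk); err != nil {
			panic(err)
		}
		for _, cp := range chunk.CompletionProbabilities {
			for _, p := range cp.Probs {
				fmt.Printf("%-12q %.4f\n", p.TokStr, p.Prob)
			}
		}
	}
}
```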
Hi, any indication of when this might be merged?
Is there any problem with merging this? I can probably help out if there is something more that needs doing. This is blocking my use case and I would love to get away from using llama.cpp directly.
Just bumping this again to see if it can be merged soon. I'm sure logprobs would be useful for a lot of people.
Hi @bmizerany, can this PR be merged? It's a very useful feature that is currently missing from Ollama.
Thank you for your PR. We're working on things this could break or might be broken by, so we're going to wait until we've had more time to make sure this lines up with those. Thank you for your patience and support!
Sorry for my ignorance, but this is just exposing core functionality of the library this project wraps.
Mind linking that conversation? I'm happy to help
Thank you for your PR. We're working on things this could break or might be broken by, so we're going to wait until we've had more time to make sure this lines up with those. Thank you for your patience and support!
Hi @bmizerany, thanks for the update. Do you have an approximate timescale for this work?
@briancleland We do not currently have a timeline to support this.
We're working on things this could break or might be broken by, so we're going to wait until we've had more time to make sure this lines up with those.
@bmizerany On reflection, I think my last question was not very clear. What I meant was, do you know when the work referred to above might be completed?
@briancleland @janpf Anyone else feel like it might be time for a hard fork with emphasis on full LLM-capabilities exposed in APIs?
@caj-larsson I would hope that wouldn't be necessary, but the unhelpful response to this PR is a bit puzzling.
Ollama is an amazing piece of tech! I have used it extensively for quite some time -- huge kudos to the team! I'm also interested in logprobs being exposed, for two reasons:
- To maintain compatibility with the OpenAI library, which does include it
- To provide better insight into the quality of the result.
Very much looking forward to seeing this integrated! Happy to help but I would expect this is not a starter-contributor ticket.
Hello there, I'm also really looking forward to using logprobs for my project. Any news on when this feature will be integrated? Greetings
Hello, same here for evaluating models with ollama through lm-evaluation-harness. Very much looking forward to seeing this integrated!
@jmorganca Any thoughts on how to progress this?
I would like to push for this feature as well. It's critical for confidence evaluation using both logprobs and entropy. (e.g., https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00598/117737)
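As a rough illustration of what that unlocks: a hedged sketch of an entropy-based confidence signal computed from the top-k probabilities this PR exposes (only an approximation, since `n_probs` truncates the distribution to k alternatives; all names here are illustrative):

```go
// Sketch: approximate per-token entropy from the top-k probabilities
// returned via n_probs. The distribution is truncated and renormalized,
// so this is a rough confidence signal, not the exact entropy over the
// full vocabulary.
package main

import (
	"fmt"
	"math"
)

// topKEntropy renormalizes the returned probabilities and computes
// H = -sum(p * ln p) in nats.
func topKEntropy(probs []float64) float64 {
	var sum float64
	for _, p := range probs {
		sum += p
	}
	var h float64
	for _, p := range probs {
		if p <= 0 {
			continue
		}
		q := p / sum // renormalize over the truncated top-k
		h -= q * math.Log(q)
	}
	return h
}

func main() {
	// Probabilities taken from the first streamed chunk in the example output above.
	fmt.Printf("entropy ~= %.3f nats\n", topKEntropy([]float64{0.5469, 0.3840}))
}
```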
Top layer logits would be very useful for potential transfer learning, for example with additional layers.
@briancleland @janpf Anyone else feel like it might be time for a hard fork with emphasis on full LLM-capabilities exposed in APIs?
I am like 99.99% with you here, where the missing 0.01% is not knowing any Go at all. It seems the lesser effort would be figuring out how to expose capabilities already in llama.cpp rather than dealing with managing the model ecosystem without ollama.
I can't believe that implementing such a simple and useful feature has been pending for over 7 months with no clear path forward! This is more like a simple print statement! I'm offering my help to fix this if someone can tell me what's pending.
I also do not understand why ollama does not support logits. They are a fundamental aspect of an LLM, and a lot of serious algorithms that improve the quality of LLM output rely on them.
Please, Ollama team, do support logits ASAP. Thank you!
Get it together Ollama team. This has been sitting unmerged for 7 months.
Any news here? Why is this taking so long?