
added logprobs (`n_probs`)

Open janpf opened this issue 1 year ago • 39 comments

As discussed on Discord, I implemented the feature. It just passes the probs through from the llama.cpp server. Sorry, first time writing Go; I might have missed something.

https://discord.com/channels/1128867683291627614/1128867684130508875/1187028494228664340

janpf avatar Dec 20 '23 19:12 janpf

Just noticed the entire llama.go file got rewritten in the meantime :/ Maybe the person refactoring llama.go could give a hint at where to implement it now? 🥺👉🏻👈🏻 I took a quick look in the online merge editor, but it's not trivial.

Happy Holidays!

janpf avatar Dec 22 '23 20:12 janpf

janpf@p085info010013 ~> curl http://localhost:11434/api/generate -d '{"model":"llama2", "temperature":0, "prompt": "Give me your probabilities", "options": {"num_predict": 2, "n_probs":2}}' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1722    0  1603  100   119   4734    351 --:--:-- --:--:-- --:--:--  5094
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.615133Z",
  "response": "\n",
  "completion_probabilities": [
    {
      "content": "\n",
      "probs": [
        {
          "prob": 0.5468829274177551,
          "tok_str": "As"
        },
        {
          "prob": 0.38397082686424255,
          "tok_str": "\n"
        }
      ]
    }
  ],
  "done": false
}
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.632652Z",
  "response": "As",
  "completion_probabilities": [
    {
      "content": "As",
      "probs": [
        {
          "prob": 0.6612563133239746,
          "tok_str": "As"
        },
        {
          "prob": 0.3387437164783478,
          "tok_str": "I"
        }
      ]
    }
  ],
  "done": false
}
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.650095Z",
  "response": " a",
  "completion_probabilities": [
    {
      "content": " a",
      "probs": [
        {
          "prob": 0.9904254078865051,
          "tok_str": " a"
        },
        {
          "prob": 0.009574598632752895,
          "tok_str": " an"
        }
      ]
    }
  ],
  "done": false
}
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.667608Z",
  "response": " responsible",
  "completion_probabilities": [
    {
      "content": " responsible",
      "probs": [
        {
          "prob": 0.9597197771072388,
          "tok_str": " responsible"
        },
        {
          "prob": 0.04028024524450302,
          "tok_str": " neutral"
        }
      ]
    }
  ],
  "done": false
}
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.685076Z",
  "response": " A",
  "completion_probabilities": [
    {
      "content": " A",
      "probs": [
        {
          "prob": 0.9497269988059998,
          "tok_str": " A"
        },
        {
          "prob": 0.050273049622774124,
          "tok_str": " and"
        }
      ]
    }
  ],
  "done": false
}
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.685199Z",
  "response": "",
  "done": true,
  "context": [
    518,
    25580,
    29962,
    3532,
    14816,
    29903,
    29958,
    5299,
    829,
    14816,
    29903,
    6778,
    13,
    13,
    29954,
    573,
    592,
    596,
    2070,
    11614,
    518,
    29914,
    25580,
    29962,
    13,
    13,
    2887,
    263,
    14040,
    319
  ],
  "total_duration": 337091792,
  "load_duration": 3560750,
  "prompt_eval_count": 11,
  "prompt_eval_duration": 267993000,
  "eval_count": 4,
  "eval_duration": 52417000
}
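For anyone wanting to consume the streamed output above, here is a minimal sketch (assuming the NDJSON response format shown in this thread, with one JSON object per chunk) that parses a single chunk and pulls out the top-token alternatives:

```python
import json

# One chunk from the streamed /api/generate response shown above.
chunk = '''{
  "model": "llama2",
  "response": "As",
  "completion_probabilities": [
    {"content": "As",
     "probs": [
       {"prob": 0.6612563133239746, "tok_str": "As"},
       {"prob": 0.3387437164783478, "tok_str": "I"}]}],
  "done": false
}'''

data = json.loads(chunk)
for entry in data.get("completion_probabilities", []):
    # Each entry carries the emitted token and its top-n alternatives.
    top = max(entry["probs"], key=lambda p: p["prob"])
    print(entry["content"], "->", top["tok_str"], round(top["prob"], 4))
```

In a real client you would iterate over the HTTP response line by line, calling `json.loads` on each chunk until `"done"` is true.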

janpf avatar Jan 08 '24 12:01 janpf

Hi, any indication of when this might be merged?

briancleland avatar Feb 12 '24 15:02 briancleland

Is there any problem with merging this? I can probably help out if there is something more that needs doing. This is blocking my use case, and I would love to get away from using llama.cpp directly.

caj-larsson avatar Mar 20 '24 11:03 caj-larsson

Just bumping this again to see if it can be merged soon. I'm sure logprobs would be useful for a lot of people.

briancleland avatar Mar 23 '24 23:03 briancleland

Hi @bmizerany, can this PR be merged? It's a very useful feature currently missing from Ollama.

iurimatias avatar Mar 25 '24 16:03 iurimatias

Thank you for your PR. We're working on things this could break or might be broken by, so we're going to wait until we've had more time to make sure this lines up with those. Thank you for your patience and support!

bmizerany avatar Mar 25 '24 19:03 bmizerany

> Thank you for your PR. We're working on things this could break or might be broken by, so we're going to wait until we've had more time to make sure this lines up with those. Thank you for your patience and support!

Sorry for my ignorance, but this is just exposing the core functionality of the library this project is a wrapper for.

Mind linking that conversation? I'm happy to help.

caj-larsson avatar Mar 26 '24 06:03 caj-larsson

> Thank you for your PR. We're working on things this could break or might be broken by, so we're going to wait until we've had more time to make sure this lines up with those. Thank you for your patience and support!

Hi @bmizerany, thanks for the update. Do you have an approximate timescale for this work?

briancleland avatar Mar 28 '24 11:03 briancleland

@briancleland We do not currently have a timeline to support this.

bmizerany avatar Apr 08 '24 18:04 bmizerany

> We're working on things this could break or might be broken by, so we're going to wait until we've had more time to make sure this lines up with those.

@bmizerany On reflection, I think my last question was not very clear. What I meant was: do you know when the work referred to above might be completed?

briancleland avatar May 06 '24 15:05 briancleland

@briancleland @janpf Anyone else feel like it might be time for a hard fork with an emphasis on full LLM capabilities exposed in APIs?

caj-larsson avatar May 08 '24 07:05 caj-larsson

@caj-larsson I would hope that wouldn't be necessary, but the unhelpful response to this PR is a bit puzzling.

briancleland avatar May 11 '24 05:05 briancleland

Ollama is an amazing piece of tech! I have used it extensively for quite some time; huge kudos to the team! I'm also interested in logprobs being exposed, for two reasons:

  1. To maintain compatibility with the OpenAI library which does include it
  2. To provide better insight into the quality of the result.

Very much looking forward to seeing this integrated! Happy to help, but I would expect this is not a starter-contributor ticket.
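On the OpenAI-compatibility point above: Ollama does not expose logprobs yet (that is what this PR is about), but for reference, the request shape that OpenAI-style compatibility would imply is roughly the following (the `logprobs` and `top_logprobs` fields are from the OpenAI Chat Completions API; whether Ollama's compatible endpoint would honor them is an assumption here):

```json
{
  "model": "llama2",
  "messages": [{"role": "user", "content": "Give me your probabilities"}],
  "logprobs": true,
  "top_logprobs": 2
}
```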

boxabirds avatar May 13 '24 08:05 boxabirds

Hello there, I'm also really looking forward to using logprobs for my project. Any news on when this feature will be integrated? Greetings

MaxOmlor avatar May 15 '24 13:05 MaxOmlor

> Hello there, I'm also really looking forward to using logprobs for my project. Any news on when this feature will be integrated? Greetings

Hello, same here for evaluating models with Ollama through lm-evaluation-harness. Very much looking forward to seeing this integrated!

eliot-christon avatar May 15 '24 14:05 eliot-christon

@jmorganca Any thoughts on how to progress this?

briancleland avatar May 15 '24 17:05 briancleland

I would like to push for this feature as well. It's critical for confidence evaluation using both logprobs and entropy. (e.g., https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00598/117737)

The-Inscrutable-X avatar May 23 '24 02:05 The-Inscrutable-X
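As a sketch of the confidence evaluation mentioned above, here is one way to compute an entropy score from the probability values this PR would expose. Note the caveat: with only the top-n tokens available, this is an entropy over a truncated, renormalized distribution, not the model's full distribution.

```python
import math

def truncated_entropy(probs):
    """Shannon entropy (in nats) over a truncated top-n distribution.

    Renormalizes so the truncated probs sum to 1; with only the top-n
    tokens available, this only approximates the true entropy.
    """
    total = sum(probs)
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)

# Top-2 probs from the " a" step in the output above: nearly certain.
confident = truncated_entropy([0.9904254078865051, 0.009574598632752895])

# Top-2 probs from the first "\n" step: much less certain.
uncertain = truncated_entropy([0.5468829274177551, 0.38397082686424255])

print(confident < uncertain)  # lower entropy = higher confidence
```

A low entropy at a generation step means the model concentrated its mass on one token; spikes in entropy flag steps where the output was close to a coin flip.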

Top-layer logits would be very useful for potential transfer learning, for example with additional layers.

wiiiktor avatar May 27 '24 22:05 wiiiktor

> @briancleland @janpf Anyone else feel like it might be time for a hard fork with an emphasis on full LLM capabilities exposed in APIs?

I am like 99.99% with you here, where the missing 0.01% is not knowing any Go at all. It seems the lesser effort would be figuring out how to expose capabilities already in llama.cpp rather than dealing with managing the model ecosystem without Ollama.

TensorTemplar avatar May 31 '24 10:05 TensorTemplar

I cannot believe that implementing such a simple and useful feature has been pending for over 7 months with no clear path forward! This is more like a simple print statement! I offer my help in fixing this issue if I know what's pending.

OriginalGoku avatar Jul 25 '24 17:07 OriginalGoku

> I cannot believe that implementing such a simple and useful feature has been pending for over 7 months with no clear path forward! This is more like a simple print statement! I offer my help in fixing this issue if I know what's pending.

I also do not understand why Ollama does not support logits. They are a fundamental aspect of an LLM, and a lot of serious algorithms that enhance the quality of LLM output rely on them.

Please, Ollama team, do support logits ASAP. Thank you!

drdsgvo avatar Jul 29 '24 09:07 drdsgvo

Get it together Ollama team. This has been sitting unmerged for 7 months.

Bruno-TT avatar Jul 29 '24 10:07 Bruno-TT

Any news here? Why is this taking so long?

josiahbryan avatar Jul 29 '24 11:07 josiahbryan