added logprobs (`n_probs`)

Open janpf opened this issue 6 months ago • 18 comments

As discussed on Discord, I implemented the feature. It just passes through the probs from the llama.cpp server. Sorry, this is my first time writing Go, so I might have missed something.

https://discord.com/channels/1128867683291627614/1128867684130508875/1187028494228664340

janpf avatar Dec 20 '23 19:12 janpf
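For context on what "passes through the probs" means here, below is a minimal Go sketch of the payload shape, with JSON field names taken from llama.cpp's server output as echoed in the example further down. The types and the demo program are illustrative, not the PR's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TokenProb is one candidate token and its probability, matching the
// "probs" entries in llama.cpp's server output.
type TokenProb struct {
	Prob   float64 `json:"prob"`
	TokStr string  `json:"tok_str"`
}

// CompletionProbability pairs a sampled token with its top-n alternatives,
// matching the "completion_probabilities" entries shown below.
type CompletionProbability struct {
	Content string      `json:"content"`
	Probs   []TokenProb `json:"probs"`
}

func main() {
	// One entry as it appears in the example output further down.
	raw := `{"content":"As","probs":[{"prob":0.66,"tok_str":"As"},{"prob":0.34,"tok_str":"I"}]}`

	var cp CompletionProbability
	if err := json.Unmarshal([]byte(raw), &cp); err != nil {
		panic(err)
	}
	fmt.Printf("sampled %q; top alternative %q (p=%.2f)\n",
		cp.Content, cp.Probs[0].TokStr, cp.Probs[0].Prob)
}
```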

Just noticed the entire llama.go file got rewritten in the meantime :/ Maybe the person refactoring llama.go could give a hint about where to implement it now? 🥺👉🏻👈🏻 I took a quick look in the online merge editor, but it's not trivial.

Happy Holidays!

janpf avatar Dec 22 '23 20:12 janpf

janpf@p085info010013 ~> curl http://localhost:11434/api/generate -d '{"model":"llama2", "temperature":0, "prompt": "Give me your probabilities", "options": {"num_predict": 2, "n_probs":2}}' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1722    0  1603  100   119   4734    351 --:--:-- --:--:-- --:--:--  5094
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.615133Z",
  "response": "\n",
  "completion_probabilities": [
    {
      "content": "\n",
      "probs": [
        {
          "prob": 0.5468829274177551,
          "tok_str": "As"
        },
        {
          "prob": 0.38397082686424255,
          "tok_str": "\n"
        }
      ]
    }
  ],
  "done": false
}
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.632652Z",
  "response": "As",
  "completion_probabilities": [
    {
      "content": "As",
      "probs": [
        {
          "prob": 0.6612563133239746,
          "tok_str": "As"
        },
        {
          "prob": 0.3387437164783478,
          "tok_str": "I"
        }
      ]
    }
  ],
  "done": false
}
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.650095Z",
  "response": " a",
  "completion_probabilities": [
    {
      "content": " a",
      "probs": [
        {
          "prob": 0.9904254078865051,
          "tok_str": " a"
        },
        {
          "prob": 0.009574598632752895,
          "tok_str": " an"
        }
      ]
    }
  ],
  "done": false
}
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.667608Z",
  "response": " responsible",
  "completion_probabilities": [
    {
      "content": " responsible",
      "probs": [
        {
          "prob": 0.9597197771072388,
          "tok_str": " responsible"
        },
        {
          "prob": 0.04028024524450302,
          "tok_str": " neutral"
        }
      ]
    }
  ],
  "done": false
}
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.685076Z",
  "response": " A",
  "completion_probabilities": [
    {
      "content": " A",
      "probs": [
        {
          "prob": 0.9497269988059998,
          "tok_str": " A"
        },
        {
          "prob": 0.050273049622774124,
          "tok_str": " and"
        }
      ]
    }
  ],
  "done": false
}
{
  "model": "llama2",
  "created_at": "2024-01-08T12:11:42.685199Z",
  "response": "",
  "done": true,
  "context": [
    518,
    25580,
    29962,
    3532,
    14816,
    29903,
    29958,
    5299,
    829,
    14816,
    29903,
    6778,
    13,
    13,
    29954,
    573,
    592,
    596,
    2070,
    11614,
    518,
    29914,
    25580,
    29962,
    13,
    13,
    2887,
    263,
    14040,
    319
  ],
  "total_duration": 337091792,
  "load_duration": 3560750,
  "prompt_eval_count": 11,
  "prompt_eval_duration": 267993000,
  "eval_count": 4,
  "eval_duration": 52417000
}

janpf avatar Jan 08 '24 12:01 janpf
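For anyone who wants to consume the stream above from code, here is a minimal client sketch, assuming an Ollama build that includes this PR. The response is newline-delimited JSON; the field names match the output above, and everything else is illustrative:

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chunk models only the fields of a streamed /api/generate response
// that this example needs.
type chunk struct {
	Response                string `json:"response"`
	Done                    bool   `json:"done"`
	CompletionProbabilities []struct {
		Content string `json:"content"`
		Probs   []struct {
			Prob   float64 `json:"prob"`
			TokStr string  `json:"tok_str"`
		} `json:"probs"`
	} `json:"completion_probabilities"`
}

func main() {
	body := []byte(`{"model":"llama2","prompt":"Give me your probabilities",` +
		`"options":{"num_predict":2,"n_probs":2}}`)
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The stream is one JSON object per line; scan it line by line.
	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		var c chunk
		if err := json.Unmarshal(sc.Bytes(), &c); err != nil {
			panic(err)
		}
		for _, cp := range c.CompletionProbabilities {
			for _, p := range cp.Probs {
				fmt.Printf("%q -> candidate %q p=%.3f\n", cp.Content, p.TokStr, p.Prob)
			}
		}
		if c.Done {
			break // the final chunk carries only timing stats and context
		}
	}
}
```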

Hi, any indication of when this might be merged?

briancleland avatar Feb 12 '24 15:02 briancleland

Is there any problem with merging this? I can probably help out if there is something more that needs doing. This is blocking my use case, and I would love to get away from using llama.cpp directly.

caj-larsson avatar Mar 20 '24 11:03 caj-larsson

Just bumping this again to see if it can be merged soon. I'm sure logprobs would be useful for a lot of people.

briancleland avatar Mar 23 '24 23:03 briancleland

Hi @bmizerany, can this PR be merged? It's a very useful feature that Ollama is currently missing.

iurimatias avatar Mar 25 '24 16:03 iurimatias

Thank you for your PR. We're working on changes that this could break, or that could break it, so we're going to wait until we've had more time to make sure this lines up with that work. Thank you for your patience and support!

bmizerany avatar Mar 25 '24 19:03 bmizerany

> Thank you for your PR. We're working on changes that this could break, or that could break it, so we're going to wait until we've had more time to make sure this lines up with that work. Thank you for your patience and support!

Sorry for my ignorance, but this just exposes core functionality of the library this project wraps.

Mind linking that conversation? I'm happy to help.

caj-larsson avatar Mar 26 '24 06:03 caj-larsson

> Thank you for your PR. We're working on changes that this could break, or that could break it, so we're going to wait until we've had more time to make sure this lines up with that work. Thank you for your patience and support!

Hi @bmizerany, thanks for the update. Do you have an approximate timescale for this work?

briancleland avatar Mar 28 '24 11:03 briancleland

@briancleland We do not currently have a timeline to support this.

bmizerany avatar Apr 08 '24 18:04 bmizerany

> We're working on changes that this could break, or that could break it, so we're going to wait until we've had more time to make sure this lines up with that work.

@bmizerany On reflection, I think my last question was not very clear. What I meant was, do you know when the work referred to above might be completed?

briancleland avatar May 06 '24 15:05 briancleland

@briancleland @janpf Does anyone else feel like it might be time for a hard fork, with an emphasis on exposing full LLM capabilities in the API?

caj-larsson avatar May 08 '24 07:05 caj-larsson

@caj-larsson I would hope that wouldn't be necessary, but the unhelpful response to this PR is a bit puzzling.

briancleland avatar May 11 '24 05:05 briancleland

Ollama is an amazing piece of tech! I have used it extensively for quite some time; huge kudos to the team! I'm also interested in logprobs being exposed, for two reasons:

  1. To maintain compatibility with the OpenAI library, which does include it.
  2. To provide better insight into the quality of the result (see the sketch below).

Very much looking forward to seeing this integrated! Happy to help, but I would expect this is not a starter-contributor ticket.

boxabirds avatar May 13 '24 08:05 boxabirds
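To illustrate point 2 above: with per-token probabilities exposed, a client can compute a rough confidence score for a whole response. A hypothetical sketch, using the sampled-token probabilities read off the example /api/generate output earlier in this thread:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// Probability of each token the model actually sampled, taken from
	// the example output earlier in the thread.
	sampled := []float64{0.384, 0.661, 0.990, 0.960, 0.950}

	var sumLog float64
	for _, p := range sampled {
		sumLog += math.Log(p)
	}
	meanLog := sumLog / float64(len(sampled))

	// A mean log-prob closer to 0 (and correspondingly lower perplexity)
	// suggests the model was more confident in what it generated.
	fmt.Printf("mean log-prob: %.3f, perplexity: %.3f\n", meanLog, math.Exp(-meanLog))
}
```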