added logprobs (`n_probs`)
As discussed on Discord, I implemented the feature. It just passes the probs through from the llama.cpp server. Sorry, first time writing Go, so I might have missed something.
https://discord.com/channels/1128867683291627614/1128867684130508875/1187028494228664340
Just noticed the entire llama.go file got rewritten in the meantime :/ maybe the person refactoring llama.go could give a hint as to where to implement it now? I took a quick look in the online merge editor but it's not trivial.
Happy Holidays!
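For context, the change is conceptually small: `n_probs` is forwarded to the llama.cpp server, and the `completion_probabilities` array it returns is copied into each streamed generate response. Roughly, the shape of the data being passed through looks like the sketch below (a minimal sketch only; the field and type names are illustrative, not the exact ones in this diff):

```go
package api

import "time"

// TokenProb mirrors one entry of llama.cpp's "probs" array.
type TokenProb struct {
	Prob   float64 `json:"prob"`
	TokStr string  `json:"tok_str"`
}

// CompletionProbability mirrors one entry of "completion_probabilities".
type CompletionProbability struct {
	Content string      `json:"content"`
	Probs   []TokenProb `json:"probs"`
}

// GenerateResponse gains one optional field that is copied verbatim from the
// llama.cpp server prediction and serialized back to the client.
type GenerateResponse struct {
	Model                   string                  `json:"model"`
	CreatedAt               time.Time               `json:"created_at"`
	Response                string                  `json:"response"`
	CompletionProbabilities []CompletionProbability `json:"completion_probabilities,omitempty"`
	Done                    bool                    `json:"done"`
}
```

Example output: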
```
janpf@p085info010013 ~> curl http://localhost:11434/api/generate -d '{"model":"llama2", "temperature":0, "prompt": "Give me your probabilities", "options": {"num_predict": 2, "n_probs":2}}' | jq
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.615133Z",
"response": "\n",
"completion_probabilities": [
{
"content": "\n",
"probs": [
{
"prob": 0.5468829274177551,
"tok_str": "As"
},
{
"prob": 0.38397082686424255,
"tok_str": "\n"
}
]
}
],
"done": false
}
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.632652Z",
"response": "As",
"completion_probabilities": [
{
"content": "As",
"probs": [
{
"prob": 0.6612563133239746,
"tok_str": "As"
},
{
"prob": 0.3387437164783478,
"tok_str": "I"
}
]
}
],
"done": false
}
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.650095Z",
"response": " a",
"completion_probabilities": [
{
"content": " a",
"probs": [
{
"prob": 0.9904254078865051,
"tok_str": " a"
},
{
"prob": 0.009574598632752895,
"tok_str": " an"
}
]
}
],
"done": false
}
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.667608Z",
"response": " responsible",
"completion_probabilities": [
{
"content": " responsible",
"probs": [
{
"prob": 0.9597197771072388,
"tok_str": " responsible"
},
{
"prob": 0.04028024524450302,
"tok_str": " neutral"
}
]
}
],
"done": false
}
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.685076Z",
"response": " A",
"completion_probabilities": [
{
"content": " A",
"probs": [
{
"prob": 0.9497269988059998,
"tok_str": " A"
},
{
"prob": 0.050273049622774124,
"tok_str": " and"
}
]
}
],
"done": false
}
{
"model": "llama2",
"created_at": "2024-01-08T12:11:42.685199Z",
"response": "",
"done": true,
"context": [
518,
25580,
29962,
3532,
14816,
29903,
29958,
5299,
829,
14816,
29903,
6778,
13,
13,
29954,
573,
592,
596,
2070,
11614,
518,
29914,
25580,
29962,
13,
13,
2887,
263,
14040,
319
],
"total_duration": 337091792,
"load_duration": 3560750,
"prompt_eval_count": 11,
"prompt_eval_duration": 267993000,
"eval_count": 4,
"eval_duration": 52417000
}
```
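If it helps review, here is a rough sketch of consuming the new field from Go (the types are minimal stand-ins for illustration, not the actual ollama `api` package):

```go
// Rough client-side sketch: stream /api/generate and print the top
// alternatives reported for each emitted token.
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

type tokenProb struct {
	Prob   float64 `json:"prob"`
	TokStr string  `json:"tok_str"`
}

type completionProb struct {
	Content string      `json:"content"`
	Probs   []tokenProb `json:"probs"`
}

type generateChunk struct {
	Response                string           `json:"response"`
	CompletionProbabilities []completionProb `json:"completion_probabilities"`
	Done                    bool             `json:"done"`
}

func main() {
	body := []byte(`{"model":"llama2","prompt":"Give me your probabilities","options":{"num_predict":2,"n_probs":2}}`)
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Each line of the streamed response is one JSON object, as in the output above.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var chunk generateChunk
		if err := json.Unmarshal(scanner.Bytes(), &chunk); err != nil {
			panic(err)
		}
		for _, cp := range chunk.CompletionProbabilities {
			for _, p := range cp.Probs {
				fmt.Printf("%-12q %.4f\n", p.TokStr, p.Prob)
			}
		}
	}
}
```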
Hi, any indication of when this might be merged?
Is there any problem with merging this? I can probably help out if there is something more that needs doing. This is blocking my use case and I would love to get away from using llama.cpp directly.
Just bumping this again to see if it can be merged soon. I'm sure logprobs would be useful for a lot of people.
Hi @bmizerany, can this PR be merged? It's a very useful feature that is currently missing from Ollama.
Thank you for your PR. We're working on things this could break or might be broken by, so we're going to wait until we've had more time to make sure this lines up with those. Thank you for your patience and support!
Sorry for my ignorance, but this is just exposing core functionality of the library this project wraps.
Mind linking that conversation? I'm happy to help
Thank you for your PR. We're working on things this could break or might be broken by, so we're going to wait until we've had more time to make sure this lines up with those. Thank you for your patience and support!
Hi @bmizerany, thanks for the update. Do you have an approximate timescale for this work?
@briancleland We do not currently have a timeline to support this.
We're working on things this could break or might be broken by, so we're going to wait until we've had more time to make sure this lines up with those.
@bmizerany On reflection, I think my last question was not very clear. What I meant was, do you know when the work referred to above might be completed?
@briancleland @janpf Anyone else feel like it might be time for a hard fork with emphasis on full LLM-capabilities exposed in APIs?
@caj-larsson I would hope that wouldn't be necessary, but the unhelpful response to this PR is a bit puzzling.
Ollama is an amazing piece of tech! I have used it extensively for quite some time -- huge kudos to the team! I'm also interested in logprobs being exposed, for two reasons:
- To maintain compatibility with the OpenAI library, which does include it
- To provide better insight into the quality of the result.
Very much looking forward to seeing this integrated! Happy to help but I would expect this is not a starter-contributor ticket.
Hello there, I'm also really looking forward to using logprobs for my project. Any news on when this feature will be integrated? Greetings
Hello, same here for evaluating models with ollama through lm-evaluation-harness. Very much looking forward to seeing this integrated!
@jmorganca Any thoughts on how to progress this?
I would like to push for this feature as well. It's critical for confidence evaluation using both logprobs and entropy. (e.g., https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00598/117737)
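As a rough illustration of what that unlocks: a hedged sketch of an entropy-based confidence signal computed from the top-k probabilities this PR exposes (only an approximation, since `n_probs` truncates the distribution to k alternatives; all names here are illustrative):

```go
// Sketch: approximate per-token entropy from the top-k probabilities
// returned via n_probs. The distribution is truncated and renormalized,
// so this is a rough confidence signal, not the exact entropy over the
// full vocabulary.
package main

import (
	"fmt"
	"math"
)

// topKEntropy renormalizes the returned probabilities and computes
// H = -sum(p * ln p) in nats.
func topKEntropy(probs []float64) float64 {
	var sum float64
	for _, p := range probs {
		sum += p
	}
	var h float64
	for _, p := range probs {
		if p <= 0 {
			continue
		}
		q := p / sum // renormalize over the truncated top-k
		h -= q * math.Log(q)
	}
	return h
}

func main() {
	// Probabilities taken from the first streamed chunk in the example output above.
	fmt.Printf("entropy ~= %.3f nats\n", topKEntropy([]float64{0.5469, 0.3840}))
}
```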
Top layer logits would be very useful for potential transfer learning, for example with additional layers.
@briancleland @janpf Anyone else feel like it might be time for a hard fork with emphasis on full LLM-capabilities exposed in APIs?
I am like 99.99% with you here, where the missing 0.01% is not knowing any Go at all. It seems the lesser effort would be figuring out how to expose capabilities already in llama.cpp rather than dealing with managing the model ecosystem without ollama.
I can't believe that implementing such a simple and useful feature has been pending for over 7 months with no clear path forward! This is more like a simple print statement! I'm offering my help to fix this if someone can tell me what's pending.
I also do not understand why ollama does not support logits. They are a fundamental aspect of an LLM, and a lot of serious algorithms that improve the quality of LLM output rely on them.
Please, Ollama team, do support logits ASAP. Thank you!
Get it together Ollama team. This has been sitting unmerged for 7 months.
Any news here? Why is this taking so long?